<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Fran</title>
    <description>The latest articles on Forem by Fran (@fransys).</description>
    <link>https://forem.com/fransys</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3762983%2Fe8b53b27-3fa9-43f4-9cfb-d61c40fde008.jpg</url>
      <title>Forem: Fran</title>
      <link>https://forem.com/fransys</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/fransys"/>
    <language>en</language>
    <item>
      <title>I Built an AI That Actually Remembers You — Here's a 4-Minute Demo</title>
      <dc:creator>Fran</dc:creator>
      <pubDate>Fri, 03 Apr 2026 09:38:02 +0000</pubDate>
      <link>https://forem.com/fransys/i-built-an-ai-that-actually-remembers-you-heres-a-4-minute-demo-2789</link>
      <guid>https://forem.com/fransys/i-built-an-ai-that-actually-remembers-you-heres-a-4-minute-demo-2789</guid>
      <description>&lt;p&gt;Every AI conversation starts from zero. I built Alma to change that.&lt;/p&gt;

&lt;p&gt;Alma is a full AI workspace with persistent memory — it learns from your conversations and uses that context across every interaction.&lt;/p&gt;

&lt;p&gt;I just published a 4-minute product demo showing everything in action:&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/XeIxJLzcRLc"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  What you'll see
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Persistent Memory&lt;/strong&gt; — Alma extracts facts, decisions, and patterns from your conversations. Each memory gets a confidence score and category. In the demo, you can see it retrieve 15+ memories in real time to answer a complex planning prompt.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Soul Engine&lt;/strong&gt; — Define your AI's personality with structured blocks: identity, rules, worldview, tensions, anti-patterns, communication modes. Not a flat text dump — a real identity system.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Video Studio&lt;/strong&gt; — Generate professional videos with Runway Gen-4 and Gen-4.5. Choose style, camera movement, duration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Image Studio&lt;/strong&gt; — Create images with Flux Pro in 10+ styles.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Writing Tools&lt;/strong&gt; — 7 AI transformations: summarize, humanize, grammar check, translate, expand, simplify, change tone.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Web Search&lt;/strong&gt; — Three levels of depth with AI-powered summaries and cited sources.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Plus:&lt;/strong&gt; Documents, Ideas, Trends &amp;amp; News, Dashboard, Command Palette, 6 specialized Skills.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Cloudflare Workers (Hono)&lt;/li&gt;
&lt;li&gt;Cloudflare D1 + Vectorize + Durable Objects + Queues&lt;/li&gt;
&lt;li&gt;React 19 + Vite 6 + Tailwind 4&lt;/li&gt;
&lt;li&gt;Anthropic Claude (Haiku/Sonnet/Opus)&lt;/li&gt;
&lt;li&gt;2,964 tests, 100+ API endpoints, 15 languages&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;Free tier available — no credit card required.&lt;/p&gt;

&lt;p&gt;🌐 &lt;strong&gt;Web:&lt;/strong&gt; &lt;a href="https://alma.olivares.ai" rel="noopener noreferrer"&gt;alma.olivares.ai&lt;/a&gt;&lt;br&gt;
📦 &lt;strong&gt;SDK:&lt;/strong&gt; &lt;code&gt;npm install @olivaresai/alma-sdk&lt;/code&gt;&lt;br&gt;
🔌 &lt;strong&gt;MCP:&lt;/strong&gt; &lt;code&gt;npm install @olivaresai/alma-mcp&lt;/code&gt;&lt;br&gt;
💻 &lt;strong&gt;VSCode:&lt;/strong&gt; Search "Alma" in Extensions&lt;/p&gt;




&lt;p&gt;Would love to hear your thoughts. What would you want your AI to remember?&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>productivity</category>
      <category>showdev</category>
    </item>
    <item>
      <title>I Tested 5 AI Memory Tools So You Don't Have To (2026 Comparison)</title>
      <dc:creator>Fran</dc:creator>
      <pubDate>Tue, 31 Mar 2026 12:57:20 +0000</pubDate>
      <link>https://forem.com/fransys/i-tested-5-ai-memory-tools-so-you-dont-have-to-2026-comparison-2ode</link>
      <guid>https://forem.com/fransys/i-tested-5-ai-memory-tools-so-you-dont-have-to-2026-comparison-2ode</guid>
      <description>&lt;p&gt;AI memory is the hottest infrastructure category of 2026. I tested the top 5 tools as both a developer and a daily user. Here's what I found.&lt;/p&gt;

&lt;h2&gt;
  
  
  The tools I tested
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Mem0&lt;/strong&gt; — The market leader ($24M raised, 80K developers)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zep&lt;/strong&gt; — Temporal knowledge graphs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Letta&lt;/strong&gt; — Agent runtime with self-editing memory&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SuperMemory&lt;/strong&gt; — All-in-one memory + RAG&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Alma&lt;/strong&gt; — End-user product with memory (full disclosure: I built this one)&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Test methodology
&lt;/h2&gt;

&lt;p&gt;I used each tool for 1 week with the same workflow:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Daily coding conversations (TypeScript, React, Cloudflare Workers)&lt;/li&gt;
&lt;li&gt;Project planning sessions&lt;/li&gt;
&lt;li&gt;Writing and brainstorming&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I evaluated: setup time, memory accuracy, retrieval quality, daily usability, and pricing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Setup time
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Setup&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Mem0&lt;/td&gt;
&lt;td&gt;15 min&lt;/td&gt;
&lt;td&gt;pip install + API key. Clean SDK.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Zep&lt;/td&gt;
&lt;td&gt;30 min&lt;/td&gt;
&lt;td&gt;Docker compose or cloud signup. More config.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Letta&lt;/td&gt;
&lt;td&gt;45 min&lt;/td&gt;
&lt;td&gt;Full agent runtime. Steeper learning curve.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SuperMemory&lt;/td&gt;
&lt;td&gt;5 min&lt;/td&gt;
&lt;td&gt;Cloud-only. Fastest setup.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Alma&lt;/td&gt;
&lt;td&gt;2 min&lt;/td&gt;
&lt;td&gt;Web signup. MCP install for Claude: 1 command.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Memory accuracy after 1 week
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Memories stored&lt;/th&gt;
&lt;th&gt;Accuracy&lt;/th&gt;
&lt;th&gt;False positives&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Mem0&lt;/td&gt;
&lt;td&gt;47&lt;/td&gt;
&lt;td&gt;78%&lt;/td&gt;
&lt;td&gt;10 (generic/obvious)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Zep&lt;/td&gt;
&lt;td&gt;31&lt;/td&gt;
&lt;td&gt;85%&lt;/td&gt;
&lt;td&gt;4 (entity-focused)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Letta&lt;/td&gt;
&lt;td&gt;23&lt;/td&gt;
&lt;td&gt;82%&lt;/td&gt;
&lt;td&gt;5 (agent-curated)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SuperMemory&lt;/td&gt;
&lt;td&gt;52&lt;/td&gt;
&lt;td&gt;71%&lt;/td&gt;
&lt;td&gt;16 (over-extracts)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Alma&lt;/td&gt;
&lt;td&gt;38&lt;/td&gt;
&lt;td&gt;87%&lt;/td&gt;
&lt;td&gt;5 (confidence scoring helps)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Key insight:&lt;/strong&gt; More memories ≠ better. SuperMemory stored the most but had the lowest accuracy because it over-extracted. Alma's confidence scoring (1.0 = user stated, 0.7 = inferred, 0.5 = observed) let me quickly filter out noise. Zep's entity focus was precise but missed conversational context.&lt;/p&gt;
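
&lt;p&gt;To make the filtering step concrete, here is a minimal sketch of thresholding memories by confidence. The &lt;code&gt;StoredMemory&lt;/code&gt; shape, the &lt;code&gt;filterByConfidence&lt;/code&gt; helper, and the sample entries are hypothetical and only illustrate the idea; they are not Alma's actual schema or API.&lt;/p&gt;

```typescript
// Hypothetical shape: field names are illustrative, not Alma's schema.
interface StoredMemory {
  content: string;
  confidence: number; // 1.0 = user stated, 0.7 = inferred, 0.5 = observed
}

// Keep only memories at or above the threshold, highest confidence first.
function filterByConfidence(
  memories: StoredMemory[],
  minConfidence: number,
): StoredMemory[] {
  return memories
    .filter((m) => m.confidence >= minConfidence)
    .sort((a, b) => b.confidence - a.confidence);
}

// Made-up sample data, one entry per confidence level.
const sampleMemories: StoredMemory[] = [
  { content: "Seems to work late on Fridays", confidence: 0.5 },
  { content: "Uses TypeScript", confidence: 1.0 },
  { content: "Probably deploys to Cloudflare Workers", confidence: 0.7 },
];

// Drop the "observed" tier and keep stated and inferred memories.
const trustedMemories = filterByConfidence(sampleMemories, 0.7);
console.log(trustedMemories.map((m) => m.content));
```

&lt;p&gt;Raising the threshold to 1.0 would keep only what the user stated explicitly, which is the quickest way to audit the noisier tiers.&lt;/p&gt;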

&lt;h3&gt;
  
  
  Retrieval quality
&lt;/h3&gt;

&lt;p&gt;When I asked "What framework am I using?" after discussing Next.js in week 1:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Found it?&lt;/th&gt;
&lt;th&gt;Response quality&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Mem0&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Returned raw memory: "Uses Next.js"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Zep&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Rich: "Next.js e-commerce project, started March 2026"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Letta&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Agent summarized: "Your main project uses Next.js with Stripe"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SuperMemory&lt;/td&gt;
&lt;td&gt;Partial&lt;/td&gt;
&lt;td&gt;Found Next.js but also returned 5 irrelevant memories&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Alma&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Context-aware: assembled Soul + relevant memories + recent episodes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Daily usability (as an end user, not a developer)
&lt;/h3&gt;

&lt;p&gt;This is where the tools diverge completely:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Can I see my memories?&lt;/th&gt;
&lt;th&gt;Can I edit/delete?&lt;/th&gt;
&lt;th&gt;Can I search?&lt;/th&gt;
&lt;th&gt;Has a UI?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Mem0&lt;/td&gt;
&lt;td&gt;Via API/dashboard&lt;/td&gt;
&lt;td&gt;Via API&lt;/td&gt;
&lt;td&gt;Via API&lt;/td&gt;
&lt;td&gt;Dashboard (basic)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Zep&lt;/td&gt;
&lt;td&gt;Via API&lt;/td&gt;
&lt;td&gt;Via API&lt;/td&gt;
&lt;td&gt;Via API&lt;/td&gt;
&lt;td&gt;No native UI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Letta&lt;/td&gt;
&lt;td&gt;Agent decides&lt;/td&gt;
&lt;td&gt;Agent decides&lt;/td&gt;
&lt;td&gt;Via agent&lt;/td&gt;
&lt;td&gt;Dev UI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SuperMemory&lt;/td&gt;
&lt;td&gt;Dashboard&lt;/td&gt;
&lt;td&gt;Dashboard&lt;/td&gt;
&lt;td&gt;Dashboard&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Alma&lt;/td&gt;
&lt;td&gt;Full UI&lt;/td&gt;
&lt;td&gt;Full UI&lt;/td&gt;
&lt;td&gt;Keyword + semantic&lt;/td&gt;
&lt;td&gt;Full app&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;If you're a developer integrating memory into your product&lt;/strong&gt;, Mem0 and Zep are the best choices. Clean APIs, good docs, production-proven.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you're a person who wants an AI that remembers you&lt;/strong&gt;, only Alma and SuperMemory offer a real end-user experience. Of the two, Alma's 3-layer architecture (memories, episodes, procedures), combined with the Soul Engine, puts it in a different league for personalization.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pricing comparison
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Free tier&lt;/th&gt;
&lt;th&gt;Paid&lt;/th&gt;
&lt;th&gt;Best value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Mem0&lt;/td&gt;
&lt;td&gt;10K memories&lt;/td&gt;
&lt;td&gt;$19 → $249/mo&lt;/td&gt;
&lt;td&gt;Standard ($19) if you don't need graphs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Zep&lt;/td&gt;
&lt;td&gt;1K credits&lt;/td&gt;
&lt;td&gt;$25/mo&lt;/td&gt;
&lt;td&gt;Good for temporal use cases&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Letta&lt;/td&gt;
&lt;td&gt;Self-hosted&lt;/td&gt;
&lt;td&gt;$20-200/mo&lt;/td&gt;
&lt;td&gt;Free if you manage infra&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SuperMemory&lt;/td&gt;
&lt;td&gt;1M tokens&lt;/td&gt;
&lt;td&gt;Usage-based&lt;/td&gt;
&lt;td&gt;Cheap for light use&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Alma&lt;/td&gt;
&lt;td&gt;500 memories&lt;/td&gt;
&lt;td&gt;$19-149/mo&lt;/td&gt;
&lt;td&gt;Pro ($19) covers most users&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  My recommendation
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Building an AI app?&lt;/strong&gt; → Mem0 (most mature) or Zep (if you need temporal reasoning)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Want an AI that knows you?&lt;/strong&gt; → Alma (full product with memory as core UX)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Researching agent memory?&lt;/strong&gt; → Letta (most innovative architecture)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Need simple, cheap memory?&lt;/strong&gt; → SuperMemory (easiest to start)&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Links: &lt;a href="https://mem0.ai" rel="noopener noreferrer"&gt;Mem0&lt;/a&gt; · &lt;a href="https://www.getzep.com" rel="noopener noreferrer"&gt;Zep&lt;/a&gt; · &lt;a href="https://letta.com" rel="noopener noreferrer"&gt;Letta&lt;/a&gt; · &lt;a href="https://supermemory.ai" rel="noopener noreferrer"&gt;SuperMemory&lt;/a&gt; · &lt;a href="https://alma.olivares.ai" rel="noopener noreferrer"&gt;Alma&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;What's your experience with AI memory tools? Drop your setup in the comments.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>machinelearning</category>
      <category>automation</category>
    </item>
    <item>
      <title>Mem0 Is an API. I Built a Product. Here's Why That Distinction Matters.</title>
      <dc:creator>Fran</dc:creator>
      <pubDate>Mon, 30 Mar 2026 11:38:39 +0000</pubDate>
      <link>https://forem.com/fransys/mem0-is-an-api-i-built-a-product-heres-why-that-distinction-matters-48bk</link>
      <guid>https://forem.com/fransys/mem0-is-an-api-i-built-a-product-heres-why-that-distinction-matters-48bk</guid>
      <description>&lt;p&gt;There are now 8+ AI memory frameworks. Mem0, Zep, Letta, Hindsight, SuperMemory, LangMem — all solving the same problem: LLMs forget everything between conversations.&lt;/p&gt;

&lt;p&gt;I spent 6 months building Alma, and I made a fundamentally different bet than all of them.&lt;/p&gt;

&lt;p&gt;They built APIs. I built a product.&lt;/p&gt;

&lt;p&gt;Let me explain why I think that matters.&lt;/p&gt;

&lt;h2&gt;
  
  
  The API approach
&lt;/h2&gt;

&lt;p&gt;Mem0 is the market leader. $24M raised, 80K developers, AWS partnership. Their pitch: "Add memory to your AI app with a few lines of code."&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mem0&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Memory&lt;/span&gt;
&lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Memory&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;User prefers dark mode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;alice&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;what does alice prefer?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;alice&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is genuinely useful. If you're building a customer support bot or an AI agent, Mem0 gives you a memory layer you don't have to build yourself. The API is clean, the docs are good, and the managed service handles infrastructure.&lt;/p&gt;

&lt;p&gt;But here's the thing: &lt;strong&gt;Mem0's customer is a developer building an app. Not the person using the app.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The end user never sees Mem0. They don't configure it. They don't decide what gets remembered. They don't search their own memories. Mem0 is infrastructure — invisible by design.&lt;/p&gt;

&lt;h2&gt;
  
  
  The product approach
&lt;/h2&gt;

&lt;p&gt;Alma takes a different position. The user IS the customer.&lt;/p&gt;

&lt;p&gt;When you open Alma, you chat with an AI that remembers you. Not because a developer wired up memory API calls — but because the product is designed around persistent context as the core experience.&lt;/p&gt;

&lt;p&gt;You can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;See&lt;/strong&gt; what Alma remembers about you (and edit/delete anything)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Configure&lt;/strong&gt; the AI's personality via Soul Engine (13 blocks: identity, tone, boundaries, knowledge...)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Search&lt;/strong&gt; across months of memories by keyword or meaning&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Separate&lt;/strong&gt; contexts into environments (work, personal, side project)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Export&lt;/strong&gt; everything in 6 formats (MD, HTML, PDF, DOCX, XLSX, JSON)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The memory isn't hidden infrastructure. It's the product.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this distinction matters
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Trust requires visibility
&lt;/h3&gt;

&lt;p&gt;When Mem0 stores a memory, the end user has no idea it happened. They can't see it, correct it, or delete it. This is fine for backend systems — but if you're building a personal AI assistant, users need to trust what's being remembered.&lt;/p&gt;

&lt;p&gt;Alma shows every memory with its confidence score (1.0 = user stated, 0.7 = AI inferred, 0.5 = observed), category, and last access date. You have full control.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Memory needs personality context
&lt;/h3&gt;

&lt;p&gt;A raw fact — "user prefers TypeScript" — means nothing without context. How should the AI use this information? Should it suggest TypeScript for every project? Only when the user asks for recommendations? Never assume, just know?&lt;/p&gt;

&lt;p&gt;Alma's Soul Engine solves this. You define not just what the AI knows, but how it behaves with that knowledge:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;soul&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;identity&amp;gt;&lt;/span&gt;Senior developer who values clean architecture.&lt;span class="nt"&gt;&amp;lt;/identity&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;tone&amp;gt;&lt;/span&gt;Direct, concise. Code over explanations.&lt;span class="nt"&gt;&amp;lt;/tone&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;anti_patterns&amp;gt;&lt;/span&gt;Never suggest "any" type. Never use var.&lt;span class="nt"&gt;&amp;lt;/anti_patterns&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;knowledge&amp;gt;&lt;/span&gt;Working on a Next.js e-commerce app called ShopperPro.&lt;span class="nt"&gt;&amp;lt;/knowledge&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/soul&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Memory APIs don't have this. They store facts without behavioral context.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Three layers beat one
&lt;/h3&gt;

&lt;p&gt;Most memory APIs store flat key-value pairs or vector embeddings. Alma uses three distinct layers:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;What it stores&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Memories&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Discrete facts with confidence scoring&lt;/td&gt;
&lt;td&gt;"Uses TypeScript" (confidence: 1.0)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Episodes&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Conversation patterns detected automatically&lt;/td&gt;
&lt;td&gt;"User debugs auth issues on Mondays"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Procedures&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Learned workflows, reinforced by use&lt;/td&gt;
&lt;td&gt;"PR review = security → performance → tests"&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Episodes and procedures are generated automatically by a background processor. The user doesn't have to manually "teach" the AI — it learns from patterns in your conversations.&lt;/p&gt;
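
&lt;p&gt;One way to picture the three layers is as a tagged union, sketched below. The type names, the fields, and the &lt;code&gt;uses&lt;/code&gt; counter are assumptions made for illustration; the real storage model may look quite different. The sample values come from the table above.&lt;/p&gt;

```typescript
// Hypothetical types: names and fields are assumptions for illustration.
interface FactMemory {
  layer: "memory";
  content: string;
  confidence: number; // 1.0 = user stated, 0.7 = inferred, 0.5 = observed
}

interface EpisodePattern {
  layer: "episode";
  pattern: string;
}

interface LearnedProcedure {
  layer: "procedure";
  title: string;
  steps: string[];
  uses: number; // assumed counter standing in for "reinforced by use"
}

type ContextItem = FactMemory | EpisodePattern | LearnedProcedure;

// Render one item as a line of context; the switch covers every layer.
function renderItem(item: ContextItem): string {
  switch (item.layer) {
    case "memory":
      return `${item.content} (confidence: ${item.confidence.toFixed(1)})`;
    case "episode":
      return `Pattern: ${item.pattern}`;
    case "procedure":
      return `${item.title} = ${item.steps.join(" → ")}`;
  }
}

// Return a copy of the procedure with its use counter incremented.
function reinforce(procedure: LearnedProcedure): LearnedProcedure {
  return { ...procedure, uses: procedure.uses + 1 };
}

const review: LearnedProcedure = {
  layer: "procedure",
  title: "PR review",
  steps: ["security", "performance", "tests"],
  uses: 3,
};
console.log(renderItem(reinforce(review)));
```

&lt;p&gt;Keeping the layer as an explicit tag lets the compiler check that every consumer handles all three kinds of context.&lt;/p&gt;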

&lt;p&gt;Mem0 recently added graph memory (entities + relationships), which is powerful for multi-entity tracking. But it's paywalled at $249/month and aimed at agent architectures, not personal use.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. The "AI that knows me" is a consumer product
&lt;/h3&gt;

&lt;p&gt;Today, 200M+ people use ChatGPT. Zero of them use Mem0 directly. The gap is obvious: people want AI that remembers them, but the existing products don't offer it.&lt;/p&gt;

&lt;p&gt;ChatGPT added "Memory" in 2024 — but it's a flat list of facts with no search, no organization, no confidence scoring, no personality system, and no way to separate work from personal context.&lt;/p&gt;

&lt;p&gt;Claude has no memory at all between conversations.&lt;/p&gt;

&lt;p&gt;The market for "personal AI with real memory" is massive and underserved. Mem0 serves developers building toward this. Alma serves users directly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where Mem0 wins
&lt;/h2&gt;

&lt;p&gt;Let me be fair. There are clear cases where Mem0 is the right choice:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;You're building a product&lt;/strong&gt; that needs memory as a feature (customer support bot, coding assistant, healthcare agent)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You need graph memory&lt;/strong&gt; for tracking entity relationships across users&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You want AWS integration&lt;/strong&gt; (Mem0 is the exclusive memory provider for AWS Agent SDK)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You have a team&lt;/strong&gt; of developers who will manage the integration&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Mem0 is great infrastructure. I'd probably use it if I were building a multi-tenant SaaS with AI features.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where Alma wins
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;You want an AI that remembers YOU&lt;/strong&gt; — not an API to add memory to your app&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You want control&lt;/strong&gt; over what's remembered, with the ability to see, edit, search, and delete&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You want personality&lt;/strong&gt; — not just memory, but behavioral context that shapes responses&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You want a complete workspace&lt;/strong&gt; — chat, code, images, documents, voice, search — all with persistent context&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You're a developer&lt;/strong&gt; who wants MCP/SDK integration AND a product to use daily&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The real comparison
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Mem0&lt;/th&gt;
&lt;th&gt;Alma&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;What it is&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Memory API for developers&lt;/td&gt;
&lt;td&gt;AI product for users&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Customer&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Developer building an app&lt;/td&gt;
&lt;td&gt;Person using AI daily&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;User sees memory?&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No (infrastructure)&lt;/td&gt;
&lt;td&gt;Yes (searchable, editable)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Personality system&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Soul Engine (13 blocks)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Memory layers&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1 (vectors) or 2 (+ graph at $249/mo)&lt;/td&gt;
&lt;td&gt;3 (memories + episodes + procedures)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Pricing&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Free → $19 → $249/mo&lt;/td&gt;
&lt;td&gt;Free → $19 → $49 → $149/mo&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Self-contained product&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No (requires your app)&lt;/td&gt;
&lt;td&gt;Yes (web app + extensions)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;MCP Server&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes (npm install)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Languages&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;English&lt;/td&gt;
&lt;td&gt;15 languages&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Try both
&lt;/h2&gt;

&lt;p&gt;If you're a developer deciding between these, I'd genuinely suggest trying both:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Mem0&lt;/strong&gt; for adding memory to an app you're building: &lt;a href="https://mem0.ai" rel="noopener noreferrer"&gt;mem0.ai&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Alma&lt;/strong&gt; for an AI that remembers you personally: &lt;a href="https://alma.olivares.ai" rel="noopener noreferrer"&gt;alma.olivares.ai&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They solve different problems. The question is which problem you have.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I'm building Alma solo. If you have questions about the architecture, memory scoring, or anything else — drop a comment. I read every one.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>saas</category>
      <category>architecture</category>
      <category>productivity</category>
    </item>
    <item>
      <title>I Built a Full AI Platform with Persistent Memory — Here's What I Learned</title>
      <dc:creator>Fran</dc:creator>
      <pubDate>Wed, 25 Mar 2026 22:17:24 +0000</pubDate>
      <link>https://forem.com/fransys/i-built-a-full-ai-platform-with-persistent-memory-heres-what-i-learned-18lp</link>
      <guid>https://forem.com/fransys/i-built-a-full-ai-platform-with-persistent-memory-heres-what-i-learned-18lp</guid>
      <description>&lt;p&gt;Alma by Olivares.AI is a persistent memory layer for AI. It remembers facts, decisions, preferences, and behavioral patterns across every conversation. Think of it as giving your AI a long-term brain.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's new:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Code Workspace&lt;/strong&gt; — Upload repos or clone from GitHub. 8 AI skills: explain, refactor, review, test, fix, document, search, commit. Resizable 3-panel layout with file explorer, editor, and AI chat. Choose between Claude Opus, Sonnet, or Haiku.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Video Studio&lt;/strong&gt; — Generate videos with Runway Gen-4 Turbo and Gen-4.5. Plan scenes with AI, manage projects with resource workspaces, generate YouTube metadata (title, description, tags), and stitch multiple clips into one video.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Global Conversation Search&lt;/strong&gt; — Search across all your conversations instantly. Find that decision you made weeks ago in seconds.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Conversation Branching&lt;/strong&gt; — Fork any conversation from a specific message. Explore alternative approaches without losing the original thread.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;10 Image Presets&lt;/strong&gt; — Including Cinematic, Anime, Watercolor, Pixel Art, Photography, 3D Render, Sketch, Neon, and Minimalist. Each preset automatically enhances your prompt.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;OCR&lt;/strong&gt; — Extract text from images using AI vision. Upload a photo of a document, whiteboard, or receipt.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Web Search with AI Summaries&lt;/strong&gt; — Perplexity-style search powered by Brave + Tavily with Claude-generated summaries and source citations.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Trends &amp;amp; News&lt;/strong&gt; — Stay current with trending topics across 10 categories (tech, science, business, health, sports, entertainment, politics, world, environment, AI).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The Stack:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cloudflare Workers (Hono) + D1 + R2 + Vectorize + KV + Durable Objects&lt;/li&gt;
&lt;li&gt;React 19 + Vite 6 + Tailwind CSS 4&lt;/li&gt;
&lt;li&gt;35 API routes, 79 migrations, 2,600+ tests&lt;/li&gt;
&lt;li&gt;MCP Server, VSCode Extension, JavaScript SDK (all on npm)&lt;/li&gt;
&lt;li&gt;15 languages supported&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Try it:&lt;/strong&gt; &lt;a href="https://alma.olivares.ai" rel="noopener noreferrer"&gt;alma.olivares.ai&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>productivity</category>
      <category>saas</category>
    </item>
    <item>
      <title>How I Score, Rank, and Assemble AI Memory in Production</title>
      <dc:creator>Fran</dc:creator>
      <pubDate>Sun, 22 Mar 2026 12:29:10 +0000</pubDate>
      <link>https://forem.com/fransys/how-i-score-rank-and-assemble-ai-memory-in-production-17jh</link>
      <guid>https://forem.com/fransys/how-i-score-rank-and-assemble-ai-memory-in-production-17jh</guid>
      <description>&lt;p&gt;Every AI app eventually hits the same problem: the model needs context, but you can't dump everything into the system prompt. Token budgets are finite. Not all information is equally relevant. And the naive approach — "just send the last N messages" — falls apart the moment your user has 200 memories and a 4,000-token budget.&lt;/p&gt;

&lt;p&gt;I've been running a memory scoring and context assembly system in production for months. This is how it works, with actual code.&lt;/p&gt;

&lt;h2&gt;
  
  
  The pipeline
&lt;/h2&gt;

&lt;p&gt;The system has four stages:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Extract&lt;/strong&gt; structured memories from conversation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deduplicate&lt;/strong&gt; against existing memories&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Score&lt;/strong&gt; and rank by relevance, importance, recency, and frequency&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Assemble&lt;/strong&gt; a token-budgeted system prompt&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Each stage has specific engineering decisions that took a while to get right.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stage 1: Extraction
&lt;/h2&gt;

&lt;p&gt;After every 5 assistant messages, a background processor fires asynchronously via &lt;code&gt;ctx.waitUntil()&lt;/code&gt;. It takes the last 20 messages and asks the cheapest available model to extract structured data:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;ExtractedMemory&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;category&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;preference&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;fact&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;decision&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;project&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;importance&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// 0.0 to 1.0&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;ExtractedEpisode&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;topics&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[];&lt;/span&gt;
  &lt;span class="nl"&gt;outcome&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The extraction prompt has specific rules:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="p"&gt;-&lt;/span&gt; Importance: 0.9+ for critical info, 0.5-0.8 for useful, below 0.5 for minor
&lt;span class="p"&gt;-&lt;/span&gt; Keep memories concise (one sentence each)
&lt;span class="p"&gt;-&lt;/span&gt; Extract 0-10 memories (only what's genuinely worth remembering)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The "0-10 memories" range matters. Early versions didn't cap extraction and the system generated noise — trivial facts diluting important ones. Capping at 10 per extraction cycle and requiring importance thresholds cleaned this up.&lt;/p&gt;
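&lt;p&gt;A minimal sketch of how that cap and threshold could be enforced on the model's raw output before anything is stored. The names &lt;code&gt;sanitizeExtraction&lt;/code&gt;, &lt;code&gt;MAX_PER_CYCLE&lt;/code&gt;, and &lt;code&gt;MIN_IMPORTANCE&lt;/code&gt; are hypothetical, and the 0.3 cutoff is an assumed value for illustration, not the one used in production:&lt;/p&gt;

```typescript
type Category = 'preference' | 'fact' | 'decision' | 'project';

interface ExtractedMemory {
  content: string;
  category: Category;
  importance: number; // 0.0 to 1.0
}

const CATEGORIES: string[] = ['preference', 'fact', 'decision', 'project'];
const MAX_PER_CYCLE = 10;   // cap from the extraction prompt
const MIN_IMPORTANCE = 0.3; // assumed cutoff, purely illustrative

// Validates untrusted LLM output, drops malformed or low-importance items,
// and keeps at most MAX_PER_CYCLE memories, most important first.
function sanitizeExtraction(raw: unknown): ExtractedMemory[] {
  if (!Array.isArray(raw)) return [];
  const valid: ExtractedMemory[] = [];
  for (const item of raw) {
    if (typeof item !== 'object' || item === null) continue;
    const { content, category, importance } = item as {
      content?: unknown;
      category?: unknown;
      importance?: unknown;
    };
    if (typeof content !== 'string' || content.trim() === '') continue;
    if (typeof category !== 'string' || CATEGORIES.indexOf(category) === -1) continue;
    if (typeof importance !== 'number' || Number.isNaN(importance)) continue;
    const clamped = Math.min(1, Math.max(0, importance));
    if (MIN_IMPORTANCE > clamped) continue;
    valid.push({ content: content.trim(), category: category as Category, importance: clamped });
  }
  return valid.sort((a, b) => b.importance - a.importance).slice(0, MAX_PER_CYCLE);
}
```

&lt;p&gt;Validating in code rather than trusting the prompt means a model that ignores the "0-10" instruction still cannot flood the store.&lt;/p&gt;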

&lt;p&gt;The episode summary is also structured — not "you talked for 45 minutes" but &lt;code&gt;{ summary: "Debugged auth middleware", topics: ["authentication", "middleware"], outcome: "Root cause was missing await" }&lt;/code&gt;. This makes episodes searchable by topic without embedding the full transcript.&lt;/p&gt;

&lt;p&gt;One critical detail: this runs fire-and-forget. The user never waits. On Cloudflare Workers, that means every background promise needs both &lt;code&gt;ctx.waitUntil()&lt;/code&gt; AND &lt;code&gt;.catch()&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;backgroundWork&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;processor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;conversationId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;catch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Background processing failed:&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;waitUntil&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;backgroundWork&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Missing that &lt;code&gt;.catch()&lt;/code&gt; on Workers with compatibility dates 2024-10+ causes unhandled rejections that silently kill the Worker. This single line prevented a crash on every chat request.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stage 2: Deduplication
&lt;/h2&gt;

&lt;p&gt;Without dedup, you get the same preference stored 30 times. "User prefers TypeScript" appearing in every extraction cycle.&lt;/p&gt;

&lt;p&gt;The approach: Jaccard similarity on extracted keywords with a 60% threshold and a 3-keyword minimum.&lt;/p&gt;

&lt;p&gt;Why 60%? Tested extensively:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;40%&lt;/strong&gt; merges distinct memories ("prefers TypeScript" gets folded into "prefers functional patterns")&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;80%&lt;/strong&gt; lets obvious duplicates through&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;60% with 3-keyword minimum&lt;/strong&gt; catches real duplicates while preserving distinct-but-related memories&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When a duplicate is detected, the existing memory's &lt;code&gt;access_count&lt;/code&gt; increments. Frequently confirmed facts naturally rise in rankings without creating noise.&lt;/p&gt;
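&lt;p&gt;The duplicate check can be sketched roughly as follows. This is an illustration rather than the production code: the tokenizer, the stopword list, and the helper names are assumptions, and the "3-keyword minimum" is read here as "the two memories must share at least three keywords", which is only one possible interpretation:&lt;/p&gt;

```typescript
// Small illustrative stopword list; a real one would be longer.
const STOPWORDS = new Set(['a', 'an', 'the', 'to', 'of', 'and', 'in', 'for', 'with', 'is', 'user']);

// Lowercases, splits on non-alphanumerics, removes stopwords and repeats.
function extractKeywords(text: string): string[] {
  const words: string[] = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  return Array.from(new Set(words.filter(w => !STOPWORDS.has(w))));
}

// Jaccard similarity = shared keywords / all distinct keywords in either memory.
function isDuplicate(a: string, b: string, threshold = 0.6, minShared = 3): boolean {
  const ka = extractKeywords(a);
  const kb = new Set(extractKeywords(b));
  const shared = ka.filter(w => kb.has(w)).length;
  const union = ka.length + kb.size - shared;
  if (minShared > shared) return false; // too little overlap to judge
  return shared / union >= threshold;
}
```

&lt;p&gt;With this tokenizer, "User prefers TypeScript for backend services" and "Prefers TypeScript for all backend services" share 4 of 5 keywords (0.8) and count as duplicates, while "Prefers TypeScript" and "Prefers functional patterns" share only one keyword and stay separate.&lt;/p&gt;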

&lt;h2&gt;
  
  
  Stage 3: Scoring
&lt;/h2&gt;

&lt;p&gt;This is where it gets interesting. Every memory gets a composite score:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;DEFAULT_WEIGHTS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;relevance&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.40&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// cosine similarity to current query&lt;/span&gt;
  &lt;span class="na"&gt;importance&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// extracted weight (0-1)&lt;/span&gt;
  &lt;span class="na"&gt;recency&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;    &lt;span class="mf"&gt;0.20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// exponential decay, 7-day half-life&lt;/span&gt;
  &lt;span class="na"&gt;frequency&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// log-scaled access count&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The recency function uses exponential decay:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;recencyScore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;accessedAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;accessed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;accessedAt&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;getTime&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;hoursAgo&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;accessed&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;halfLifeHours&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// 7 days&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exp&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;LN2&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;hoursAgo&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nx"&gt;halfLifeHours&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A memory accessed just now scores 1.0, one last accessed a week ago scores 0.5, and one last accessed two weeks ago scores 0.25. Stale memories therefore never disappear outright; they just yield to fresher ones when the budget is tight.&lt;/p&gt;

&lt;p&gt;Frequency uses logarithmic scaling so high-access memories don't dominate:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;frequencyScore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;accessCount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;accessCount&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log10&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;accessCount&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Why these weights?
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Relevance at 0.40&lt;/strong&gt; because a memory about cooking is useless when you're debugging auth, however well it scores on importance, recency, and frequency. Semantic relevance is the primary filter.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Importance at 0.30&lt;/strong&gt; because not all memories are equal. "User is migrating to PostgreSQL this quarter" (0.9) should outrank "User mentioned coffee" (0.3), even if the coffee mention is more recent.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Recency at 0.20&lt;/strong&gt; because conversations have temporal context. What you discussed yesterday is more likely relevant than what you discussed a month ago — but not always.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Frequency at 0.10&lt;/strong&gt; as a tiebreaker. Memories that keep surfacing in different conversations are probably important, but this shouldn't override direct relevance.&lt;/p&gt;
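&lt;p&gt;The article does not show how the four components are combined. A plain weighted sum is the natural reading, so the sketch below assumes it. &lt;code&gt;ScoreInputs&lt;/code&gt; and &lt;code&gt;compositeScore&lt;/code&gt; are hypothetical names, and the input values are made up to mirror the PostgreSQL and coffee example:&lt;/p&gt;

```typescript
interface ScoreInputs {
  relevance: number;  // each component is expected in the range 0..1
  importance: number;
  recency: number;
  frequency: number;
}

const WEIGHTS: ScoreInputs = { relevance: 0.4, importance: 0.3, recency: 0.2, frequency: 0.1 };

// Assumed combination rule: weighted sum of the four components.
function compositeScore(s: ScoreInputs, w: ScoreInputs = WEIGHTS): number {
  return (
    w.relevance * s.relevance +
    w.importance * s.importance +
    w.recency * s.recency +
    w.frequency * s.frequency
  );
}

// Two-week-old but relevant and important: roughly 0.69.
const migration = compositeScore({ relevance: 0.8, importance: 0.9, recency: 0.25, frequency: 0.5 });

// Fresh but trivial and off-topic: roughly 0.37.
const coffee = compositeScore({ relevance: 0.2, importance: 0.3, recency: 1.0, frequency: 0.0 });
```

&lt;p&gt;Even with a perfect recency score, the coffee memory cannot catch up, because recency contributes at most 0.20 to the total.&lt;/p&gt;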

&lt;h3&gt;
  
  
  The confidence dimension
&lt;/h3&gt;

&lt;p&gt;Each memory also has a &lt;code&gt;confidence&lt;/code&gt; score that's separate from importance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;1.0&lt;/strong&gt; — user explicitly stated this&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;0.7&lt;/strong&gt; — AI inferred this from conversation context&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Confidence also affects which memories get retrieved. A high-confidence preference (the user said "I always use TypeScript") should surface ahead of a high-importance but low-confidence inference ("probably prefers dark mode based on theme discussion").&lt;/p&gt;

&lt;h2&gt;
  
  
  Stage 4: Context Assembly
&lt;/h2&gt;

&lt;p&gt;The Context Assembler takes scored memories and builds a token-budgeted system prompt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;AssembledContext&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;systemPrompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;soulTokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;memoriesIncluded&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;memoriesTokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;episodesIncluded&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;proceduresIncluded&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;totalTokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;topMemoryScores&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;score&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The assembly order is strict:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Soul blocks first&lt;/strong&gt; (identity, style, context) — always included, non-negotiable&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scored memories&lt;/strong&gt; — ranked, filling up to 50% of remaining token budget&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Recent episodes&lt;/strong&gt; — latest conversation summaries&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Relevant procedures&lt;/strong&gt; — behavioral patterns matching the current query&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Everything is wrapped in XML sections for structured parsing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;alma_soul&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;identity&amp;gt;&lt;/span&gt;...&lt;span class="nt"&gt;&amp;lt;/identity&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;anti_patterns&amp;gt;&lt;/span&gt;...&lt;span class="nt"&gt;&amp;lt;/anti_patterns&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/alma_soul&amp;gt;&lt;/span&gt;

&lt;span class="nt"&gt;&amp;lt;alma_memories&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;memory&lt;/span&gt; &lt;span class="na"&gt;importance=&lt;/span&gt;&lt;span class="s"&gt;"0.9"&lt;/span&gt; &lt;span class="na"&gt;category=&lt;/span&gt;&lt;span class="s"&gt;"project"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;Migrating auth to PostgreSQL&lt;span class="nt"&gt;&amp;lt;/memory&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;memory&lt;/span&gt; &lt;span class="na"&gt;importance=&lt;/span&gt;&lt;span class="s"&gt;"0.7"&lt;/span&gt; &lt;span class="na"&gt;category=&lt;/span&gt;&lt;span class="s"&gt;"preference"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;Prefers concise code reviews&lt;span class="nt"&gt;&amp;lt;/memory&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/alma_memories&amp;gt;&lt;/span&gt;

&lt;span class="nt"&gt;&amp;lt;alma_episodes&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;episode&lt;/span&gt; &lt;span class="na"&gt;topics=&lt;/span&gt;&lt;span class="s"&gt;"auth,middleware"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;Debugged auth middleware...&lt;span class="nt"&gt;&amp;lt;/episode&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/alma_episodes&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;XML-safe truncation is critical — you never cut mid-tag. If a memory doesn't fit within the remaining budget, skip it entirely rather than corrupting the XML structure.&lt;/p&gt;
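&lt;p&gt;The skip-instead-of-truncate rule can be sketched as a greedy pass over the ranked memories. The names are hypothetical, and &lt;code&gt;estimateTokens&lt;/code&gt; uses a rough four-characters-per-token guess as a stand-in for a real tokenizer:&lt;/p&gt;

```typescript
interface ScoredMemory {
  content: string;
  score: number;
}

// Rough estimate (about 4 characters per token); an assumption, not a tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Walks memories from highest to lowest score and keeps each one only if it
// fits whole. A memory that does not fit is skipped, never cut short, so any
// markup later wrapped around it always stays balanced.
function selectWithinBudget(memories: ScoredMemory[], budgetTokens: number): ScoredMemory[] {
  const ranked = [...memories].sort((a, b) => b.score - a.score);
  const included: ScoredMemory[] = [];
  let remaining = budgetTokens;
  for (const memory of ranked) {
    const cost = estimateTokens(memory.content);
    if (cost > remaining) continue; // skip entirely; a smaller one may still fit
    included.push(memory);
    remaining -= cost;
  }
  return included;
}
```

&lt;p&gt;Continuing past a memory that is too large, rather than stopping there, lets shorter lower-ranked memories use whatever budget is left.&lt;/p&gt;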

&lt;h3&gt;
  
  
  Why XML over JSON?
&lt;/h3&gt;

&lt;p&gt;Tested both. XML with labeled attributes gives the model clearer section boundaries. JSON works fine for structured data but the model is more likely to reference XML-tagged content naturally in responses. The &lt;code&gt;importance&lt;/code&gt; and &lt;code&gt;category&lt;/code&gt; attributes are visible to the model, which helps it prioritize.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I got wrong
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;First version had no scoring.&lt;/strong&gt; Just retrieved the N most recent memories. This breaks immediately — a critical project decision from last week gets buried under trivial facts from today.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Second version over-weighted recency.&lt;/strong&gt; Everything decayed too fast. Important long-term preferences disappeared within two weeks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Third version didn't deduplicate.&lt;/strong&gt; After a month of use, the same preferences appeared 40+ times, eating token budget with redundant information.&lt;/p&gt;

&lt;p&gt;The current scoring weights are version four. They've been stable in production for months, but they're still configurable per user — different use cases might need different balances.&lt;/p&gt;

&lt;h2&gt;
  
  
  Numbers
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Extraction latency: 0ms user-facing (background processing)&lt;/li&gt;
&lt;li&gt;Scoring: &amp;lt;5ms for 500 memories&lt;/li&gt;
&lt;li&gt;Context assembly: &amp;lt;10ms including soul prompt rendering&lt;/li&gt;
&lt;li&gt;D1 reads: 1-5ms, writes: 5-15ms&lt;/li&gt;
&lt;li&gt;Total overhead per message: near-zero for the user, ~2-4 seconds background&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The system is Alma — &lt;a href="https://alma.olivares.ai" rel="noopener noreferrer"&gt;alma.olivares.ai&lt;/a&gt;. It wraps this pipeline in a web app, MCP server (21 tools for Claude Desktop/Cursor/Windsurf), VSCode extension, and REST API. Free tier available.&lt;/p&gt;

&lt;p&gt;But the scoring architecture applies to any AI system that needs to manage context at scale. The core insight: memory without ranking is just a pile of text. Ranking without token budgeting overflows the context window. Both without extraction means the user maintains everything by hand, and extraction without deduplication fills the budget with repeats. You need all four stages.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>typescript</category>
      <category>architecture</category>
      <category>showdev</category>
    </item>
    <item>
      <title>I built a memory system for AI — here's the architecture</title>
      <dc:creator>Fran</dc:creator>
      <pubDate>Sat, 14 Mar 2026 09:44:31 +0000</pubDate>
      <link>https://forem.com/fransys/i-built-a-memory-system-for-ai-heres-the-architecture-2fn9</link>
      <guid>https://forem.com/fransys/i-built-a-memory-system-for-ai-heres-the-architecture-2fn9</guid>
      <description>&lt;p&gt;If you use Claude Code or Claude Projects with a well-written &lt;code&gt;CLAUDE.md&lt;/code&gt;, you already know the difference it makes. The AI knows your stack, your conventions, your project structure. It's genuinely great.&lt;/p&gt;

&lt;p&gt;But &lt;code&gt;CLAUDE.md&lt;/code&gt; is static. You write it once, you maintain it manually, and it lives in one project. What about your preferences across projects? What about decisions you made three weeks ago? What about the patterns the AI could learn from watching how you work — if it had somewhere to store them?&lt;/p&gt;

&lt;p&gt;That's the gap I wanted to close. So I built &lt;a href="https://alma.olivares.ai" rel="noopener noreferrer"&gt;Alma&lt;/a&gt; — a cognitive memory system that gives AI persistent, structured memory that grows over time.&lt;/p&gt;

&lt;p&gt;Here's how it works under the hood.&lt;/p&gt;

&lt;h2&gt;
  
  
  The architecture: 3 layers of memory
&lt;/h2&gt;

&lt;p&gt;Alma organizes memory in three distinct layers, each serving a different purpose.&lt;/p&gt;

&lt;h3&gt;
  
  
  Memories — what the AI knows about you
&lt;/h3&gt;

&lt;p&gt;Structured facts. Each one is semantically indexed, categorized, scored by importance, and tagged with confidence:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Prefers TypeScript over JavaScript"          [confidence: 1.0, category: preference]
"Project uses D1 with Drizzle ORM"            [confidence: 1.0, category: technical]
"Hates verbose explanations — get to the point" [confidence: 0.8, category: preference]
"Decided on event-driven architecture March 3" [confidence: 1.0, category: decision]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you say "review my auth middleware", Alma's &lt;strong&gt;Context Assembler&lt;/strong&gt; runs a hybrid search (keyword + semantic) across your memories. It pulls the ones relevant to auth, to your stack, to your coding preferences — and injects them into the system prompt before the LLM even sees your message.&lt;/p&gt;

&lt;p&gt;The result: the AI already has that context by the time it reads your first message.&lt;/p&gt;
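&lt;p&gt;One way a hybrid (keyword + semantic) score could be computed is sketched below. Everything here is an assumption for illustration: the function names, the simple substring-based keyword match, and the 0.7 blend factor are not taken from Alma, and the embedding vectors are expected to come from whatever embedding model you already use:&lt;/p&gt;

```typescript
// Cosine similarity between two embedding vectors; returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  const n = Math.min(a.length, b.length);
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; n > i; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Fraction of query terms that appear in the memory text (case-insensitive).
function keywordOverlap(queryTerms: string[], memoryText: string): number {
  if (queryTerms.length === 0) return 0;
  const text = memoryText.toLowerCase();
  const hits = queryTerms.filter(t => text.indexOf(t.toLowerCase()) !== -1).length;
  return hits / queryTerms.length;
}

// Blends both signals; alpha is a hypothetical tuning knob.
function hybridScore(semantic: number, keyword: number, alpha = 0.7): number {
  return alpha * semantic + (1 - alpha) * keyword;
}
```

&lt;p&gt;The keyword half catches exact identifiers such as library names that embeddings can blur, while the semantic half catches paraphrases that share no words with the query.&lt;/p&gt;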

&lt;h3&gt;
  
  
  Episodes — what happened before
&lt;/h3&gt;

&lt;p&gt;After each conversation, a background processor generates a structured summary:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Episode: "Auth Middleware Refactor"
  Summary: Rewrote JWT validation to use jose library.
           Added refresh token rotation. Decided against
           session cookies for API-first architecture.
  Topics: auth, security, middleware
  Outcome: PR merged, deployed to staging
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you say "remember that auth discussion?", the AI recalls the full episode — decisions, outcomes, context. Structured summaries, not raw transcript fragments.&lt;/p&gt;

&lt;h3&gt;
  
  
  Procedures — how you like to work
&lt;/h3&gt;

&lt;p&gt;Procedures are behavioral patterns the AI learns from observing your interactions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"When reviewing code → check error handling first, then types"
"When explaining → use bullet points, not paragraphs"
"When debugging → ask for the error message before suggesting fixes"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These aren't stored and forgotten. They're matched against context on every conversation and applied dynamically. After a few weeks, the AI starts anticipating how you want things done — without you ever explicitly configuring it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Soul Engine: identity, not a system prompt
&lt;/h2&gt;

&lt;p&gt;The Soul Engine goes beyond a single system prompt. It's 12 structured blocks organized in three sections:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SOUL&lt;/strong&gt; — who the AI is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Identity&lt;/strong&gt;: core character, name, role&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Worldview&lt;/strong&gt;: how it approaches problems&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rules&lt;/strong&gt;: non-negotiable behaviors (never fabricate memories, acknowledge uncertainty)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tensions&lt;/strong&gt;: the paradoxes that make personality feel real ("technical but warm", "concise but thorough when it matters")&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;STYLE&lt;/strong&gt; — how it communicates:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Style Guide&lt;/strong&gt;: voice, vocabulary, structure&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anti-Patterns&lt;/strong&gt;: things to never do ("never say 'As an AI language model'")&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Communication Modes&lt;/strong&gt;: different modes for different situations (teaching, debugging, creative)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Example Interactions&lt;/strong&gt;: calibration by demonstration&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;CONTEXT&lt;/strong&gt; — what it knows right now:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;User Profile&lt;/strong&gt;, &lt;strong&gt;Active Context&lt;/strong&gt;, &lt;strong&gt;Learned Patterns&lt;/strong&gt;, &lt;strong&gt;Scratchpad&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Plus custom blocks you define yourself&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every conversation, the &lt;strong&gt;Context Assembler&lt;/strong&gt; renders this into a structured XML system prompt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;alma_soul&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;identity&amp;gt;&lt;/span&gt;You are Alma. Direct, technical, warm...&lt;span class="nt"&gt;&amp;lt;/identity&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;worldview&amp;gt;&lt;/span&gt;Simplicity over cleverness. Working code over elegant abstractions.&lt;span class="nt"&gt;&amp;lt;/worldview&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;tensions&amp;gt;&lt;/span&gt;Technical but approachable. Opinionated but open to correction.&lt;span class="nt"&gt;&amp;lt;/tensions&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;rules&amp;gt;&lt;/span&gt;Always reference relevant memories. Never fabricate information.&lt;span class="nt"&gt;&amp;lt;/rules&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/alma_soul&amp;gt;&lt;/span&gt;

&lt;span class="nt"&gt;&amp;lt;alma_context&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;user_profile&amp;gt;&lt;/span&gt;Senior dev, TypeScript, Hono + D1 stack...&lt;span class="nt"&gt;&amp;lt;/user_profile&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;active_context&amp;gt;&lt;/span&gt;Working on auth middleware refactor...&lt;span class="nt"&gt;&amp;lt;/active_context&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;memories&amp;gt;&lt;/span&gt;
    [12 most relevant memories for this conversation, ranked by semantic score]
  &lt;span class="nt"&gt;&amp;lt;/memories&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;episodes&amp;gt;&lt;/span&gt;
    [3 recent relevant episodes with summaries and outcomes]
  &lt;span class="nt"&gt;&amp;lt;/episodes&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;procedures&amp;gt;&lt;/span&gt;
    [Matched behavioral patterns for code review context]
  &lt;span class="nt"&gt;&amp;lt;/procedures&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/alma_context&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Priority order: soul blocks first (always), then memories, then episodes, then procedures, all trimmed to fit a token budget. The AI gets a complete picture of who you are and what's happening, every single time.&lt;/p&gt;

&lt;h2&gt;
  
  
  It learns while you chat
&lt;/h2&gt;

&lt;p&gt;Memory extraction runs in the background. You never wait for it.&lt;/p&gt;

&lt;p&gt;After a conversation, a background processor (cheapest LLM, fire-and-forget with &lt;code&gt;ctx.waitUntil()&lt;/code&gt;) analyzes the exchange and:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Extracts new memories&lt;/li&gt;
&lt;li&gt;Generates episode summaries&lt;/li&gt;
&lt;li&gt;Updates your user profile and active context&lt;/li&gt;
&lt;li&gt;Refines procedures from observed patterns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You just chat. The AI gets quietly better after every interaction.&lt;/p&gt;

&lt;h2&gt;
  
  
  Developer-first: everything is an API
&lt;/h2&gt;

&lt;h3&gt;
  
  
  REST API — 140+ endpoints
&lt;/h3&gt;

&lt;p&gt;Full CRUD on everything. Memories, episodes, procedures, blocks, conversations, chat (SSE streaming), files, images, voice, teams.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Assemble full context for any message&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://alma.olivares.ai/api/v1/context/assemble &lt;span class="se"&gt;\&lt;/span&gt;
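  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"X-API-Key: alma_key_..."&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;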
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"user_message": "Review the auth middleware"}'&lt;/span&gt;

&lt;span class="c"&gt;# Returns: structured system prompt + metadata (token counts, memory scores, keywords)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  MCP Server — 21 tools for Claude Desktop, Cursor, Windsurf
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"alma"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@olivaresai/alma-mcp"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"env"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"ALMA_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"alma_key_..."&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your AI gets native tools: &lt;code&gt;alma_search&lt;/code&gt;, &lt;code&gt;alma_remember&lt;/code&gt;, &lt;code&gt;alma_recall&lt;/code&gt;, &lt;code&gt;alma_assemble&lt;/code&gt;, &lt;code&gt;alma_focus&lt;/code&gt;, &lt;code&gt;alma_update_block&lt;/code&gt; — it reads and writes to its own memory as part of reasoning.&lt;/p&gt;

&lt;h3&gt;
  
  
  JavaScript SDK
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @olivaresai/alma-sdk
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Alma&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@olivaresai/alma-sdk&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;alma&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Alma&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;alma_key_...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;alma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;assemble&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Review the auth middleware&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="c1"&gt;// → Full system prompt with soul, memories, episodes, procedures&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  VSCode Extension
&lt;/h3&gt;

&lt;p&gt;Memory search from the command palette. Context injection. Chat with persistent memory without leaving your editor.&lt;/p&gt;

&lt;h2&gt;
  
  
  Voice, images, documents — same memory
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Voice Chat&lt;/strong&gt;: Deepgram Nova-2 (transcription) + ElevenLabs (synthesis). Talk to your AI by voice — same persistent memory as text.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Image Studio&lt;/strong&gt;: Flux Pro + Leonardo AI. The AI remembers your style preferences and past generations.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Document Generation&lt;/strong&gt;: Export conversations to PDF, DOCX, XLSX, PPTX.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every modality shares the same memory layer. A voice conversation references decisions from a text chat two weeks ago.&lt;/p&gt;

&lt;h2&gt;
  
  
  3 models, your choice
&lt;/h2&gt;

&lt;p&gt;Powered exclusively by Anthropic Claude:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tier&lt;/th&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Use case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Normal&lt;/td&gt;
&lt;td&gt;Claude Haiku&lt;/td&gt;
&lt;td&gt;Quick tasks, everyday&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Advanced&lt;/td&gt;
&lt;td&gt;Claude Sonnet&lt;/td&gt;
&lt;td&gt;Professional work, complex analysis&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Complex&lt;/td&gt;
&lt;td&gt;Claude Opus&lt;/td&gt;
&lt;td&gt;Deep reasoning, nuanced problems&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Free plan gets Haiku. Paid plans get all three. Switch anytime — memory carries over.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;BYOK&lt;/strong&gt;: On Advanced+ plans, bring your own Anthropic, Replicate, or Leonardo API keys. Queries go direct to your accounts.&lt;/p&gt;

&lt;h2&gt;
  
  
  Privacy
&lt;/h2&gt;

&lt;p&gt;Your memories, episodes, procedures, and identity blocks are the most personal data an AI can hold. Alma's position:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You own everything. Full &lt;code&gt;.alma&lt;/code&gt; portable export. GDPR compliant (Articles 15-22).&lt;/li&gt;
&lt;li&gt;Never used for training. Zero tracking. Zero analytics.&lt;/li&gt;
&lt;li&gt;Account deletion permanently purges databases, R2 storage, and Stripe records. No retention.&lt;/li&gt;
&lt;li&gt;Encrypted at rest and in transit. API keys hashed and never exposed after creation.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Plan&lt;/th&gt;
&lt;th&gt;Price&lt;/th&gt;
&lt;th&gt;Highlights&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Free&lt;/td&gt;
&lt;td&gt;$0 forever&lt;/td&gt;
&lt;td&gt;500 memories, 50 episodes, Claude Haiku&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pro&lt;/td&gt;
&lt;td&gt;$19/mo&lt;/td&gt;
&lt;td&gt;10K memories, 3 AI tiers, voice, images&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Advanced&lt;/td&gt;
&lt;td&gt;$49/mo&lt;/td&gt;
&lt;td&gt;50K memories, API + MCP access, BYOK&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Ultimate&lt;/td&gt;
&lt;td&gt;$149/mo&lt;/td&gt;
&lt;td&gt;Unlimited everything, dedicated support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Ultimate Max&lt;/td&gt;
&lt;td&gt;$249/mo&lt;/td&gt;
&lt;td&gt;2x weekly AI budget, maximum capacity&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Weekly AI budget resets each Monday. Credit packs ($14.99 / $39.99 / $89.99) never expire.&lt;/p&gt;

&lt;h2&gt;
  
  
  The stack
&lt;/h2&gt;

&lt;p&gt;If you're curious: the entire system runs on &lt;strong&gt;Cloudflare Workers&lt;/strong&gt; (D1 for SQL, Vectorize for embeddings, R2 for files), &lt;strong&gt;Hono&lt;/strong&gt; for the API framework, &lt;strong&gt;React&lt;/strong&gt; for the frontend, and &lt;strong&gt;Anthropic Claude&lt;/strong&gt; for all AI inference. 56 database migrations, ~1,600 tests passing. Solo developer.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;Link&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Web App&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;a href="https://alma.olivares.ai" rel="noopener noreferrer"&gt;alma.olivares.ai&lt;/a&gt; — free, no credit card&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;MCP Server&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://www.npmjs.com/package/@olivaresai/alma-mcp" rel="noopener noreferrer"&gt;@olivaresai/alma-mcp&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;VSCode&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://marketplace.visualstudio.com/items?itemName=olivares.alma-vscode" rel="noopener noreferrer"&gt;VS Code Marketplace&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;JS SDK&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://www.npmjs.com/package/@olivaresai/alma-sdk" rel="noopener noreferrer"&gt;@olivaresai/alma-sdk&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;REST API&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;a href="https://olivares.ai/developers" rel="noopener noreferrer"&gt;Developer Docs&lt;/a&gt; — 140+ endpoints&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Docs&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://olivares.ai/docs/overview" rel="noopener noreferrer"&gt;olivares.ai/docs&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The free tier has 500 memories and no time limit. If you've ever been frustrated by an AI that forgets everything, give it a few conversations. The difference is immediate.&lt;/p&gt;

&lt;p&gt;What would you want an AI that actually remembers you to do? I'd genuinely like to know.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://olivares.ai" rel="noopener noreferrer"&gt;OlivaresAI&lt;/a&gt;.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F24iuo33dbrxyjei9x7bz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F24iuo33dbrxyjei9x7bz.png" alt=" " width="800" height="361"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>devops</category>
      <category>productivity</category>
    </item>
    <item>
      <title>I replaced my claude.md with a 3-layer cognitive memory system. Here's the architecture.</title>
      <dc:creator>Fran</dc:creator>
      <pubDate>Thu, 12 Mar 2026 07:51:43 +0000</pubDate>
      <link>https://forem.com/fransys/i-replaced-my-claudemd-with-a-3-layer-cognitive-memory-system-heres-the-architecture-15l8</link>
      <guid>https://forem.com/fransys/i-replaced-my-claudemd-with-a-3-layer-cognitive-memory-system-heres-the-architecture-15l8</guid>
      <description>&lt;p&gt;I built a structured memory system for AI called Alma. This post explains the architecture, not the marketing.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem, technically
&lt;/h2&gt;

&lt;p&gt;Current AI memory implementations (claude.md, .cursorrules, ChatGPT Memory) share these limitations:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;No schema.&lt;/strong&gt; All data is unstructured text. No types, no fields, no queryable metadata.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No weighting.&lt;/strong&gt; Every piece of information has equal priority in the context window.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No automatic extraction.&lt;/strong&gt; The user manually maintains the memory.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No deduplication.&lt;/strong&gt; Similar information accumulates without merging.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No separation of concerns.&lt;/strong&gt; Identity, style preferences, and session context are mixed.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fao6dej0dp8scd2as0rp7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fao6dej0dp8scd2as0rp7.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Focb7ql81ltwhfd40lku2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Focb7ql81ltwhfd40lku2.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The architecture
&lt;/h2&gt;

&lt;p&gt;Alma has three data layers and an assembly engine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────┐
│                Context Assembler             │
│  (dynamic token budget, relevance scoring)   │
├──────────┬──────────┬──────────┬────────────┤
│ Soul     │ Memories │ Episodes │ Procedures │
│ Engine   │          │          │            │
│ 13 blocks│ Weighted │ Summaries│ Behavioral │
│ Identity │ facts    │ w/ topics│ patterns   │
│ Style    │ w/ score │ outcomes │ auto-      │
│ Context  │ category │ search   │ extracted  │
└──────────┴──────────┴──────────┴────────────┘
         ↑ Background Processor ↑
         (async, every N messages)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Layer 1: Memories
&lt;/h3&gt;

&lt;p&gt;Schema:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;Memory&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;category&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;preference&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;fact&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;decision&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;project&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;general&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;importance&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;     &lt;span class="c1"&gt;// 0-1, determines context priority&lt;/span&gt;
  &lt;span class="nl"&gt;source&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;manual&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;extracted&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;extension&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;api&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;consolidated&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;access_count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;   &lt;span class="c1"&gt;// incremented on retrieval&lt;/span&gt;
  &lt;span class="nl"&gt;reinforcement_count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// incremented on dedup match&lt;/span&gt;
  &lt;span class="nl"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Float32Array&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;     &lt;span class="c1"&gt;// for semantic search&lt;/span&gt;
  &lt;span class="nl"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;last_accessed_at&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



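&lt;p&gt;Deduplication uses Jaccard similarity on keyword sets, with a 60% threshold and a 3-keyword minimum. When a new memory scores above the threshold against an existing one, Alma reinforces the existing memory (incrementing &lt;code&gt;reinforcement_count&lt;/code&gt;) instead of creating a new record.&lt;/p&gt;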
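
&lt;p&gt;A minimal sketch of that check, not Alma's actual code. The keyword extractor is a hypothetical stand-in (the post does not describe how keywords are derived), and the sketch assumes the 3-keyword minimum applies to both keyword sets:&lt;/p&gt;

```typescript
// Hypothetical keyword extraction: lowercase, split on non-alphanumerics,
// drop words shorter than 4 characters. The real extractor is not described in the post.
function extractKeywords(text: string): string[] {
  const words = text.toLowerCase().split(/[^a-z0-9]+/).filter((w) => w.length >= 4);
  return Array.from(new Set(words));
}

// Jaccard similarity: |A ∩ B| / |A ∪ B| over two keyword sets.
function jaccard(a: string[], b: string[]): number {
  const setA = new Set(a);
  const setB = new Set(b);
  const intersection = Array.from(setA).filter((w) => setB.has(w)).length;
  const union = setA.size + setB.size - intersection;
  return union === 0 ? 0 : intersection / union;
}

const DEDUP_THRESHOLD = 0.6; // 60% threshold, from the post
const MIN_KEYWORDS = 3;      // 3-keyword minimum, from the post

// True when the candidate should reinforce the existing memory instead of being stored.
function isDuplicate(existing: string, candidate: string): boolean {
  const a = extractKeywords(existing);
  const b = extractKeywords(candidate);
  // Assumption: texts with too few keywords are never treated as duplicates.
  if (MIN_KEYWORDS > a.length || MIN_KEYWORDS > b.length) return false;
  return jaccard(a, b) > DEDUP_THRESHOLD;
}
```

&lt;p&gt;With this extractor, "User prefers TypeScript strict mode everywhere" and "User prefers TypeScript strict mode" share five of six keywords, so the second would reinforce the first rather than create a new record.&lt;/p&gt;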

&lt;p&gt;Search is hybrid: keyword (SQL FTS5) + semantic (cosine similarity on Cloudflare Vectorize embeddings). Results are merged and re-ranked by a weighted score:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;WEIGHTS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;relevance&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.40&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="c1"&gt;// Cosine similarity to current query&lt;/span&gt;
  &lt;span class="na"&gt;importance&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// 0.0-1.0, extracted or user-assigned&lt;/span&gt;
  &lt;span class="na"&gt;recency&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;     &lt;span class="c1"&gt;// Exponential decay, 7-day half-life&lt;/span&gt;
  &lt;span class="na"&gt;frequency&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="c1"&gt;// Logarithmic scale of access count&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
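
&lt;p&gt;One way these weights could combine into a single ranking score, as a sketch under stated assumptions: every component is normalized to 0..1, recency decays from the last access time, and frequency saturates at a hypothetical cap of 100 accesses. The post gives only the weights and the comments above, so the helper names and the normalization are illustrative:&lt;/p&gt;

```typescript
const WEIGHTS = { relevance: 0.4, importance: 0.3, recency: 0.2, frequency: 0.1 };

const HALF_LIFE_DAYS = 7;  // 7-day half-life, from the post
const FREQUENCY_CAP = 100; // hypothetical: access count at which frequency reaches 1.0

interface RankInput {
  relevance: number;   // cosine similarity, assumed already scaled to 0..1
  importance: number;  // 0..1
  ageDays: number;     // assumption: days since last access
  accessCount: number;
}

// Exponential decay: 1.0 now, 0.5 after one half-life, 0.25 after two.
function recencyScore(ageDays: number): number {
  return Math.pow(0.5, Math.max(0, ageDays) / HALF_LIFE_DAYS);
}

// Logarithmic scale: early accesses count for more than later ones.
function frequencyScore(accessCount: number): number {
  const scaled = Math.log1p(Math.max(0, accessCount)) / Math.log1p(FREQUENCY_CAP);
  return Math.min(1, scaled);
}

function rankScore(m: RankInput): number {
  return (
    WEIGHTS.relevance * m.relevance +
    WEIGHTS.importance * m.importance +
    WEIGHTS.recency * recencyScore(m.ageDays) +
    WEIGHTS.frequency * frequencyScore(m.accessCount)
  );
}
```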



&lt;h3&gt;
  
  
  Layer 2: Episodes
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;Episode&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;conversation_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;topics&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[];&lt;/span&gt;
  &lt;span class="nl"&gt;outcome&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;message_count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Float32Array&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Auto-generated at conversation end. Searchable by topic, outcome, or semantic similarity.&lt;/p&gt;

&lt;h3&gt;
  
  
  Layer 3: Procedures
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;Procedure&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;        &lt;span class="c1"&gt;// "Checks error handling first in code reviews"&lt;/span&gt;
  &lt;span class="nl"&gt;category&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;trigger&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;        &lt;span class="c1"&gt;// When this pattern activates&lt;/span&gt;
  &lt;span class="nl"&gt;source&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;extracted&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;manual&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Extracted by the background processor analyzing conversation patterns. These represent behavioral habits, not explicit preferences.&lt;/p&gt;

&lt;h3&gt;
  
  
  Soul Engine: 13 blocks
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;SoulSection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;identity&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;style&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;context&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;BlockKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
  &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;identity&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;worldview&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;tensions&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;rules&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
  &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;style_guide&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;anti_patterns&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;communication&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;examples&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
  &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user_profile&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;active_context&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;learned_patterns&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
  &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;scratchpad&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;custom&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;SoulBlock&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;BlockKey&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;section&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;SoulSection&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;char_limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;priority&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;truncation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;head&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;tail&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// head = keep newest, tail = keep oldest&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Identity blocks use &lt;code&gt;tail&lt;/code&gt; truncation (preserve oldest = core values stable). Context blocks use &lt;code&gt;head&lt;/code&gt; truncation (trim oldest = keep fresh data). This simple mechanism creates different temporal behaviors without complex logic.&lt;/p&gt;
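
&lt;p&gt;The two modes can be sketched in a few lines. This assumes block content is appended chronologically (oldest text first) and that the limit is measured in characters, as &lt;code&gt;char_limit&lt;/code&gt; suggests; &lt;code&gt;truncateBlock&lt;/code&gt; is an illustrative name, not Alma's API:&lt;/p&gt;

```typescript
type Truncation = 'head' | 'tail';

// Assumes content is stored oldest-first, so the start of the string is the oldest text.
function truncateBlock(content: string, charLimit: number, mode: Truncation): string {
  if (charLimit >= content.length) return content;
  if (mode === 'head') {
    // Trim from the head: drop the oldest characters, keep the newest.
    return content.slice(content.length - charLimit);
  }
  // Trim from the tail: keep the oldest characters, drop the newest.
  return content.slice(0, charLimit);
}
```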

&lt;h3&gt;
  
  
  Context Assembler
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;assembleContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// 1. Soul Engine — always included, highest priority&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;soul&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;renderSoulBlocks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// 2. Relevant memories — scored by semantic similarity to current message&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;memories&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;searchMemories&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;hybrid&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="c1"&gt;// 3. Recent episodes — for conversation continuity&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;episodes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getRecentEpisodes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// 4. Matching procedures — behavioral patterns&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;procedures&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;matchProcedures&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// 5. Dynamic token budget — sections compete for space&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;buildPrompt&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;soul&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;memories&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;episodes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;procedures&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="nx"&gt;TOKEN_BUDGET&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each section has a priority. If total tokens exceed the budget, lower-priority sections get truncated first. The Soul Engine is always preserved in full.&lt;/p&gt;
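
&lt;p&gt;A rough sketch of how such a budget could be enforced. The names, the "higher number = more important" convention, and the 4-characters-per-token estimate are all assumptions for illustration; a real implementation would use a proper tokenizer:&lt;/p&gt;

```typescript
interface PromptSection {
  label: string;
  text: string;
  priority: number; // hypothetical convention: higher number = more important
}

// Rough estimate (about 4 characters per token), standing in for a real tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function fitToBudget(soul: string, sections: PromptSection[], tokenBudget: number): string {
  // The soul is always kept in full, even if it uses the whole budget.
  let remaining = Math.max(0, tokenBudget - estimateTokens(soul));
  const kept: string[] = [soul];
  // Fill from the highest priority down, so the lowest-priority
  // sections are the first to be shortened or dropped.
  const ordered = [...sections].sort((a, b) => b.priority - a.priority);
  for (const section of ordered) {
    if (remaining === 0) break;
    if (section.text.length === 0) continue;
    const text = section.text.slice(0, remaining * 4);
    kept.push(text);
    remaining -= estimateTokens(text);
  }
  return kept.join('\n\n');
}
```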

&lt;h3&gt;
  
  
  Background Processor
&lt;/h3&gt;

&lt;p&gt;Fires asynchronously via &lt;code&gt;ctx.waitUntil()&lt;/code&gt; every N messages:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Sends recent conversation to Claude Haiku for analysis&lt;/li&gt;
&lt;li&gt;Receives structured JSON with extracted memories, episodes, procedures&lt;/li&gt;
&lt;li&gt;Deduplicates memories against existing store&lt;/li&gt;
&lt;li&gt;Updates relevant soul blocks (active_context, learned_patterns, user_profile)&lt;/li&gt;
&lt;li&gt;Stores episode summary&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Because &lt;code&gt;ctx.waitUntil()&lt;/code&gt; keeps this work off the response path, it adds no latency to the conversation.&lt;/p&gt;
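
&lt;p&gt;The scheduling side of this can be sketched as follows. &lt;code&gt;waitUntil&lt;/code&gt; is the real Workers execution-context method; everything else (the function name, the value of N, the shape of the analysis callback) is hypothetical, since the post does not show this code:&lt;/p&gt;

```typescript
const PROCESS_EVERY_N = 5; // hypothetical value; the post only says "every N messages"

// Minimal shape of the Workers execution context that this sketch relies on.
interface BackgroundContext {
  waitUntil(task: object): void; // in Workers, the argument is a Promise
}

// Schedules `analyze` after every Nth message without awaiting it,
// so the reply can be returned before the analysis finishes.
function maybeScheduleAnalysis(
  ctx: BackgroundContext,
  messageCount: number,
  analyze: () => object,
): boolean {
  if (messageCount === 0 || messageCount % PROCESS_EVERY_N !== 0) return false;
  ctx.waitUntil(analyze());
  return true;
}
```

&lt;p&gt;In a request handler, this would be called just before returning the response, with &lt;code&gt;analyze&lt;/code&gt; wrapping the extraction, deduplication, and storage steps listed above.&lt;/p&gt;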

&lt;h2&gt;
  
  
  Infrastructure
&lt;/h2&gt;

&lt;p&gt;Entirely Cloudflare:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Workers&lt;/strong&gt; — API, SSE streaming, background processing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;D1&lt;/strong&gt; — SQLite database (56 migrations)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vectorize&lt;/strong&gt; — Embedding storage and similarity search&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;R2&lt;/strong&gt; — File uploads (images, documents)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;KV&lt;/strong&gt; — Configuration cache&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Durable Objects&lt;/strong&gt; — Atomic budget tracking (single-threaded counters)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No AWS. No external databases. Cold start under 5ms.&lt;/p&gt;

&lt;h2&gt;
  
  
  Numbers
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;1,690 passing tests across 102 files&lt;/li&gt;
&lt;li&gt;56 database migrations&lt;/li&gt;
&lt;li&gt;180 REST API endpoints&lt;/li&gt;
&lt;li&gt;15 fully localized languages&lt;/li&gt;
&lt;li&gt;6 agent tools in chat + 21 MCP tools + 9 MCP resources&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Web app&lt;/strong&gt;: &lt;a href="https://olivares.ai" rel="noopener noreferrer"&gt;alma.olivares.ai&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Free tier: 500 memories, Claude Haiku, automatic learning. No credit card.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built by Francisco @ Olivares.AI&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>showdev</category>
      <category>typescript</category>
      <category>architecture</category>
    </item>
    <item>
      <title>The Soul Engine: 13 blocks that replaced my 200-line system prompt</title>
      <dc:creator>Fran</dc:creator>
      <pubDate>Wed, 11 Mar 2026 12:12:35 +0000</pubDate>
      <link>https://forem.com/fransys/the-soul-engine-13-blocks-that-replaced-my-200-line-system-prompt-2o0h</link>
      <guid>https://forem.com/fransys/the-soul-engine-13-blocks-that-replaced-my-200-line-system-prompt-2o0h</guid>
      <description>&lt;p&gt;I used Claude with a 200+ line system prompt for months. Every convention, every preference, every project decision — crammed into a single text document. It worked. Barely.&lt;/p&gt;

&lt;p&gt;Three problems kept growing:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. No priority.&lt;/strong&gt; "Be concise" and "never fabricate data" had equal weight. One is a style preference. The other is a critical rule.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Static by design.&lt;/strong&gt; I corrected the same behavior ten times — "don't add comments to obvious code" — and it never stuck because the prompt didn't learn.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Mixed concerns.&lt;/strong&gt; "You are thoughtful and direct" and "I'm working on the auth module this week" are fundamentally different types of information with different lifespans.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Soul Engine
&lt;/h3&gt;

&lt;p&gt;I built a replacement. 13 blocks organized into three sections, each with a different purpose and rate of change:&lt;/p&gt;

&lt;h3&gt;
  
  
  Section 1: alma_soul — WHO the AI is
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;alma_soul&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;identity&amp;gt;&lt;/span&gt;Core traits. Non-negotiable.&lt;span class="nt"&gt;&amp;lt;/identity&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;worldview&amp;gt;&lt;/span&gt;Beliefs, principles, decision framework.&lt;span class="nt"&gt;&amp;lt;/worldview&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;tensions&amp;gt;&lt;/span&gt;Creative paradoxes: "technical but warm",
    "concise but thorough when needed"&lt;span class="nt"&gt;&amp;lt;/tensions&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;rules&amp;gt;&lt;/span&gt;Behavioral rules. Always followed.&lt;span class="nt"&gt;&amp;lt;/rules&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/alma_soul&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;tensions&lt;/code&gt; block is worth highlighting. Instead of flat rules, you define paradoxes: "opinionated about code quality, flexible about everything else." This produces more nuanced responses than a list of dos and don'ts. The AI gets permission to be complex.&lt;/p&gt;

&lt;p&gt;These blocks are &lt;strong&gt;stable&lt;/strong&gt; — you define them once and they rarely change.&lt;/p&gt;

&lt;h3&gt;
  
  
  Section 2: alma_style — HOW it communicates
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;alma_style&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;anti_patterns&amp;gt;&lt;/span&gt;Things to NEVER do.&lt;span class="nt"&gt;&amp;lt;/anti_patterns&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;style_guide&amp;gt;&lt;/span&gt;Voice, vocabulary, formatting.&lt;span class="nt"&gt;&amp;lt;/style_guide&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;communication_modes&amp;gt;&lt;/span&gt;
    "Debug mode: ask 2-3 questions first"
    "Code review: be direct, say 'change X to Y'"
  &lt;span class="nt"&gt;&amp;lt;/communication_modes&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;example_interactions&amp;gt;&lt;/span&gt;Calibration samples.&lt;span class="nt"&gt;&amp;lt;/example_interactions&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/alma_style&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;anti_patterns&lt;/code&gt; block was the single most impactful change. Five lines:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Never start with "Great question!" or "That's interesting!"
Never hedge facts with "I think" or "I believe"
Never add comments to code unless logic is non-obvious
If response starts with an apology, rewrite without it
Never list more than 5 bullet points — synthesize instead
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why this works: Claude is trained on "helpful assistant" patterns. Suppressing specific unwanted patterns is a clearer signal than vague aspirational guidelines. The model knows exactly what &lt;em&gt;not&lt;/em&gt; to do.&lt;/p&gt;

&lt;p&gt;These blocks &lt;strong&gt;evolve slowly&lt;/strong&gt; as you refine your preferences.&lt;/p&gt;

&lt;h3&gt;
  
  
  Section 3: alma_context — WHAT it knows about you
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;alma_context&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;user_profile&amp;gt;&lt;/span&gt;Facts about you. Auto-updated.&lt;span class="nt"&gt;&amp;lt;/user_profile&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;active_context&amp;gt;&lt;/span&gt;Current projects, focus areas.&lt;span class="nt"&gt;&amp;lt;/active_context&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;learned_patterns&amp;gt;&lt;/span&gt;Patterns discovered from your behavior.&lt;span class="nt"&gt;&amp;lt;/learned_patterns&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;scratchpad&amp;gt;&lt;/span&gt;Working memory for current conversation.&lt;span class="nt"&gt;&amp;lt;/scratchpad&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/alma_context&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This section &lt;strong&gt;updates itself&lt;/strong&gt;. A background processor fires every few messages, analyzes the conversation with a lightweight model, and updates your profile, context, and patterns. No manual maintenance.&lt;/p&gt;

&lt;h3&gt;
  
  
  Context Assembly
&lt;/h3&gt;

&lt;p&gt;Having 13 blocks is meaningless if they blow the context window. The assembler manages a strict token budget:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Soul blocks&lt;/strong&gt; always fit — highest priority, never truncated&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ranked memories&lt;/strong&gt; fill up to 50% of remaining budget, scored by:

&lt;ul&gt;
&lt;li&gt;Relevance to current topic: 40%&lt;/li&gt;
&lt;li&gt;Importance: 30%&lt;/li&gt;
&lt;li&gt;Recency (7-day half-life exponential decay): 20%&lt;/li&gt;
&lt;li&gt;Access frequency (log scale): 10%&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Episode summaries&lt;/strong&gt; and &lt;strong&gt;procedures&lt;/strong&gt; fill the remainder&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;XML-safe truncation&lt;/strong&gt; — never cuts mid-tag&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If budget runs out, context sections drop first, then style. Soul blocks stay intact.&lt;/p&gt;
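
&lt;p&gt;As a rough illustration, the memory weights above can be combined into a single score. This is a sketch under stated assumptions, not Alma's implementation: the &lt;code&gt;Memory&lt;/code&gt; fields are hypothetical, relevance and importance are assumed to be normalized to the 0..1 range already, and the &lt;code&gt;ACCESS_CAP&lt;/code&gt; used to normalize the log-scaled frequency is an invented value.&lt;/p&gt;

```typescript
// Hypothetical memory record; relevance and importance are assumed to be
// pre-normalized to the 0..1 range.
interface Memory {
  relevance: number;
  importance: number;
  ageDays: number;
  accessCount: number;
}

const HALF_LIFE_DAYS = 7;
const ACCESS_CAP = 100; // assumed saturation point for the log scale

function scoreMemory(m: Memory): number {
  // Exponential decay: the recency term halves every 7 days.
  const recency = Math.pow(0.5, m.ageDays / HALF_LIFE_DAYS);
  // Log-scaled access frequency, clamped to 1.
  const frequency = Math.min(1, Math.log(1 + m.accessCount) / Math.log(1 + ACCESS_CAP));
  return 0.4 * m.relevance + 0.3 * m.importance + 0.2 * recency + 0.1 * frequency;
}

// Highest score first; the input array is not mutated.
function rankMemories(memories: Memory[]): Memory[] {
  return memories.slice().sort((a, b) => scoreMemory(b) - scoreMemory(a));
}
```

&lt;p&gt;With a 7-day half-life, a memory that is one week old keeps half of its recency weight and one that is two weeks old keeps a quarter.&lt;/p&gt;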

&lt;h3&gt;
  
  
  What changes over time
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Week 1:&lt;/strong&gt; Feels normal. The system silently builds context.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Week 2:&lt;/strong&gt; The AI stops asking "what language do you use?" Code matches your conventions. Past decisions get referenced naturally.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Month 1:&lt;/strong&gt; Cross-conversation connections. "This is similar to the approach you decided against for the auth module." That's when it shifts from tool to collaborator.&lt;/p&gt;

&lt;h3&gt;
  
  
  Try it
&lt;/h3&gt;

&lt;p&gt;This is the core of Alma (&lt;a href="https://olivares.ai" rel="noopener noreferrer"&gt;alma.olivares.ai&lt;/a&gt;). The Soul Engine is fully available on the free tier: all 13 blocks, fully editable. The value appears around day 14, once enough context has accumulated.&lt;/p&gt;

&lt;p&gt;What would your 13 blocks look like?&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>saas</category>
      <category>typescript</category>
    </item>
    <item>
      <title>usulnet v26.2.7 — open-source Docker infrastructure platform</title>
      <dc:creator>Fran</dc:creator>
      <pubDate>Mon, 23 Feb 2026 20:32:27 +0000</pubDate>
      <link>https://forem.com/fransys/usulnet-v2627-open-source-docker-infrastructure-platform-12pp</link>
      <guid>https://forem.com/fransys/usulnet-v2627-open-source-docker-infrastructure-platform-12pp</guid>
      <description>&lt;p&gt;usulnet is an open-source, self-hosted Docker infrastructure platform. One binary, one web UI — containers, security, backups, reverse proxy, DNS, VPN, monitoring, terminal, file browser, multi-node orchestration. No vendor lock-in, no telemetry, no cloud dependency.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/fr4nsys/usulnet" rel="noopener noreferrer"&gt;github.com/fr4nsys/usulnet&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Website&lt;/strong&gt;: &lt;a href="https://usulnet.com" rel="noopener noreferrer"&gt;usulnet.com&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;v26.2.7 is the biggest release yet: 11 new features, 17 bug fixes (several critical), and a complete proxy simplification.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqozpmpk1s1ov7nw72n2s.png" alt=" " width="800" height="400"&gt;
&lt;/h2&gt;

&lt;h2&gt;
  
  
  What's New in v26.2.7
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Embedded DNS Server
&lt;/h3&gt;

&lt;p&gt;Full authoritative DNS server built into usulnet, powered by &lt;a href="https://github.com/miekg/dns" rel="noopener noreferrer"&gt;miekg/dns&lt;/a&gt; (the Go library behind CoreDNS). Runs in-process — no external DNS software to install or manage.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Zone management&lt;/strong&gt; — Create primary, secondary, and forward zones with full SOA configuration. Serial auto-increments on every record change.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;10 record types&lt;/strong&gt; — A, AAAA, CNAME, MX, TXT, NS, SRV, PTR, CAA, SOA. Per-record TTL and enable/disable toggle.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TSIG keys&lt;/strong&gt; — Transaction Signature keys for secure zone transfers. Secrets encrypted at rest with AES-256-GCM.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Upstream forwarding&lt;/strong&gt;: Non-authoritative queries are forwarded to configurable upstreams (default: Cloudflare for Families 1.1.1.3 + 1.0.0.3, which block malware and adult content).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Live statistics&lt;/strong&gt; — Real-time query counters, zones loaded, server uptime, health check.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audit logging&lt;/strong&gt; — Every zone/record/key change logged with user, action, resource, and timestamp.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;8 new UI pages&lt;/strong&gt;: Including zone list, create/edit, detail with inline record management, DNS settings, and audit log.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  DNS Service Discovery
&lt;/h3&gt;

&lt;p&gt;Running Docker containers are automatically registered as DNS records — no manual configuration.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;A records&lt;/strong&gt;: &lt;code&gt;redis.containers.local&lt;/code&gt; → container IP. Registered on &lt;code&gt;container start&lt;/code&gt;, removed on &lt;code&gt;container stop/die&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SRV records&lt;/strong&gt;: Exposed ports get &lt;code&gt;_8080._tcp.myapp.containers.local&lt;/code&gt; for service discovery by name and port.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-time&lt;/strong&gt;: Docker event stream callbacks — instant registration/deregistration, no polling.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reconciliation&lt;/strong&gt;: Periodic full-state sync catches events missed during transient Docker API disconnects.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;dns&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="na"&gt;listen_addr&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;:53"&lt;/span&gt;
  &lt;span class="na"&gt;service_discovery&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="na"&gt;domain&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;containers.local"&lt;/span&gt;
    &lt;span class="na"&gt;create_srv&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
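
&lt;p&gt;To make the naming scheme concrete, here is a small TypeScript sketch that derives the A and SRV record names for one container, following the examples above. The &lt;code&gt;ContainerInfo&lt;/code&gt; and &lt;code&gt;DnsRecord&lt;/code&gt; types and the &lt;code&gt;recordsFor&lt;/code&gt; function are hypothetical and are not part of usulnet's code or API.&lt;/p&gt;

```typescript
// Illustrative types only; not usulnet's data model.
interface ContainerInfo {
  name: string;
  ip: string;
  tcpPorts: number[];
}

interface DnsRecord {
  type: "A" | "SRV";
  name: string;
  target: string;
  port?: number;
}

// One A record for the container, plus one SRV record per exposed TCP port,
// named _PORT._tcp.CONTAINER.DOMAIN as in the examples above.
function recordsFor(container: ContainerInfo, domain: string): DnsRecord[] {
  const host = container.name + "." + domain;
  const records: DnsRecord[] = [{ type: "A", name: host, target: container.ip }];
  for (const port of container.tcpPorts) {
    records.push({ type: "SRV", name: "_" + port + "._tcp." + host, target: host, port: port });
  }
  return records;
}
```

&lt;p&gt;For a container named &lt;code&gt;myapp&lt;/code&gt; exposing port 8080 under the &lt;code&gt;containers.local&lt;/code&gt; domain, this yields &lt;code&gt;myapp.containers.local&lt;/code&gt; and &lt;code&gt;_8080._tcp.myapp.containers.local&lt;/code&gt;.&lt;/p&gt;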



&lt;h3&gt;
  
  
  WireGuard VPN Management
&lt;/h3&gt;

&lt;p&gt;Native WireGuard VPN from the web UI. No CLI, no config file editing.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create and manage multiple WireGuard interfaces per host&lt;/li&gt;
&lt;li&gt;Add peers with auto-generated Curve25519 keys and preshared keys&lt;/li&gt;
&lt;li&gt;Client config generation (copy-paste or QR code)&lt;/li&gt;
&lt;li&gt;Transfer statistics (rx/tx) per interface and per peer&lt;/li&gt;
&lt;li&gt;Post-up/post-down script support for routing rules&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Firewall Manager
&lt;/h3&gt;

&lt;p&gt;Visual iptables/nftables management — create, edit, apply, and sync firewall rules from the browser.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Chains&lt;/strong&gt;: INPUT, OUTPUT, FORWARD, DOCKER-USER&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Protocols&lt;/strong&gt;: TCP, UDP, ICMP, ALL&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Actions&lt;/strong&gt;: ACCEPT, DROP, REJECT, LOG&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audit log&lt;/strong&gt;: Every rule change recorded with user, action, timestamp, and rule details&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auto-detection&lt;/strong&gt;: Detects whether the host uses iptables or nftables and applies through the correct backend&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;One-click sync&lt;/strong&gt;: Apply individual rules or sync the entire ruleset to the host&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  SSL Observatory
&lt;/h3&gt;

&lt;p&gt;SSL Labs-style TLS scanner for monitoring certificate health across your infrastructure.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Certificate scanning&lt;/strong&gt;: Analyzes protocol versions (TLS 1.0–1.3), cipher suites, certificate chains, OCSP stapling, HSTS, and Certificate Transparency logs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Grading&lt;/strong&gt;: A+ to F letter grades with 0–100 numeric scoring&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dashboard&lt;/strong&gt;: Grade distribution chart and expiring certificate alerts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Detailed reports&lt;/strong&gt;: Per-target breakdown with actionable remediation guidance&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Backup Verification
&lt;/h3&gt;

&lt;p&gt;Automated backup integrity verification — proving backups are actually restorable, not just present.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Three methods&lt;/strong&gt;: Extract (unpack and validate), Container (mount and verify), Database (restore to temp instance and query)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Integrity checks&lt;/strong&gt;: Checksums, file readability, container accessibility, data integrity&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Schedulable&lt;/strong&gt;: Cron expressions for recurring automated verification&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;History&lt;/strong&gt;: Full run log with status, method, duration, and error details&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Container Image Builder
&lt;/h3&gt;

&lt;p&gt;Build Docker images from Dockerfiles directly in the web UI.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multi-stage build support&lt;/li&gt;
&lt;li&gt;Build arguments and platform targeting&lt;/li&gt;
&lt;li&gt;Reusable Dockerfile templates&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Automated Rollback
&lt;/h3&gt;

&lt;p&gt;Automatic stack rollback when deployments fail or health checks break.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Configurable rollback policies&lt;/li&gt;
&lt;li&gt;Retry limits and cooldown periods&lt;/li&gt;
&lt;li&gt;Full execution history&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Crontab Manager
&lt;/h3&gt;

&lt;p&gt;Web-based cron job scheduling — create, edit, enable/disable, and execute jobs from the UI.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Three command types&lt;/strong&gt;: Shell commands (with working directory), Docker exec (target container), HTTP webhooks (GET/POST/PUT/DELETE)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cron scheduling&lt;/strong&gt;: Standard 5-field expressions via robfig/cron/v3&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Execution history&lt;/strong&gt;: Every run recorded — status, stdout/stderr, exit code, duration&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Run Now&lt;/strong&gt;: Execute any job immediately, independent of schedule&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auto-cleanup&lt;/strong&gt;: Records older than 30 days pruned automatically&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Interactive Network Topology Graph
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;/topology&lt;/code&gt; page was upgraded from static cards to an interactive D3.js force-directed graph.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Force-directed layout&lt;/strong&gt;: Networks as rectangles, containers as circles, physics-based positioning&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Drag &amp;amp; drop&lt;/strong&gt;: Rearrange nodes, pin in place&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zoom &amp;amp; pan&lt;/strong&gt;: Mouse wheel and drag, reset button&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hover highlighting&lt;/strong&gt;: Hovering a node highlights connections, dims everything else&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Click details&lt;/strong&gt;: Sidebar panel with driver, subnet, state, connections&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Color-coded&lt;/strong&gt;: Networks by driver (bridge=blue, overlay=green), containers by state (running=green, stopped=red)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fullscreen mode&lt;/strong&gt;: For large topologies&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Container Marketplace (Business)
&lt;/h3&gt;

&lt;p&gt;Curated app marketplace for one-click Docker Compose deployments.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Searchable catalog with category filtering&lt;/li&gt;
&lt;li&gt;Featured and verified app badges&lt;/li&gt;
&lt;li&gt;User ratings and reviews&lt;/li&gt;
&lt;li&gt;Configurable deployment fields&lt;/li&gt;
&lt;li&gt;Community app submission&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Proxy Simplification: Nginx-Only
&lt;/h2&gt;

&lt;p&gt;Caddy and Nginx Proxy Manager backends have been &lt;strong&gt;completely removed&lt;/strong&gt; — ~6,000 lines of dead code eliminated. Nginx is now the sole reverse proxy backend, always enabled.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;New capabilities:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;DNS-01 wildcard certificates&lt;/strong&gt;: &lt;code&gt;*.example.com&lt;/code&gt; via Cloudflare DNS API&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docker exec mode&lt;/strong&gt;: When nginx runs in a container, usulnet uses the Docker API to execute &lt;code&gt;nginx -t&lt;/code&gt; and &lt;code&gt;nginx -s reload&lt;/code&gt; inside it, so no local nginx binary is needed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sidebar search&lt;/strong&gt;: Compact filter input below the logo, filters navigation in real-time, Escape clears&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Already in usulnet
&lt;/h2&gt;

&lt;p&gt;If you're discovering usulnet for the first time, here's what the platform already includes:&lt;/p&gt;

&lt;h3&gt;
  
  
  Core Docker
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Containers&lt;/strong&gt;: Full lifecycle — create, start, stop, restart, pause, kill, remove. Bulk operations, real-time stats, settings editor, filesystem browser.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Images&lt;/strong&gt;: Pull, inspect, remove, prune. Docker Hub + private registries. Layer history.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Volumes&lt;/strong&gt;: CRUD + built-in file browser for volume contents.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Networks&lt;/strong&gt;: Bridge, overlay, macvlan. Connect/disconnect containers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stacks&lt;/strong&gt;: Docker Compose deployment from YAML, Git repos, or built-in catalog (20 apps).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docker Swarm&lt;/strong&gt;: Initialize clusters, manage nodes, scale services, promote/demote, live service logs, rollback.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Security
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Trivy scanning&lt;/strong&gt;: CVE detection with severity classification per container and image&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security scoring&lt;/strong&gt;: 0-100 composite score per container and across infrastructure&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SBOM generation&lt;/strong&gt;: CycloneDX and SPDX formats&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RBAC&lt;/strong&gt;: 46 granular permissions, custom roles, team-based scoping&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;2FA/TOTP&lt;/strong&gt;: Google Authenticator, backup codes, account lockout&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LDAP/OIDC&lt;/strong&gt;: Active Directory, OAuth2 (GitHub, Google, Microsoft)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audit logging&lt;/strong&gt;: Every action logged to PostgreSQL with IP, timestamp, details&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AES-256-GCM encryption&lt;/strong&gt; for all secrets at rest&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Monitoring &amp;amp; Alerting
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Real-time CPU, memory, network, disk metrics per container and per host&lt;/li&gt;
&lt;li&gt;Threshold-based alert rules (OK → Pending → Firing → Resolved)&lt;/li&gt;
&lt;li&gt;11 notification channels, including Email, Slack, Discord, Telegram, Gotify, ntfy, PagerDuty, Opsgenie, Teams, and Webhook&lt;/li&gt;
&lt;li&gt;Docker event stream with filtering&lt;/li&gt;
&lt;li&gt;Prometheus &lt;code&gt;/metrics&lt;/code&gt; endpoint&lt;/li&gt;
&lt;/ul&gt;
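
&lt;p&gt;The alert lifecycle listed above (OK → Pending → Firing → Resolved) can be read as a small state machine. The sketch below is one plausible interpretation, not usulnet's actual logic: the transition rules, the &lt;code&gt;REQUIRED_BREACHES&lt;/code&gt; value, and the &lt;code&gt;evaluate&lt;/code&gt; function are all assumptions made for illustration.&lt;/p&gt;

```typescript
type AlertState = "OK" | "Pending" | "Firing" | "Resolved";

interface AlertStatus {
  state: AlertState;
  breaches: number; // consecutive evaluations above the threshold
}

const REQUIRED_BREACHES = 3; // assumed: evaluations spent Pending before Firing

// Computes the next status from the previous one and a fresh metric value.
function evaluate(prev: AlertStatus, value: number, threshold: number): AlertStatus {
  if (value > threshold) {
    const breaches = prev.breaches + 1;
    if (prev.state === "Firing" || breaches >= REQUIRED_BREACHES) {
      return { state: "Firing", breaches: breaches };
    }
    return { state: "Pending", breaches: breaches };
  }
  // Back under the threshold: a firing alert resolves, anything else is OK.
  if (prev.state === "Firing") {
    return { state: "Resolved", breaches: 0 };
  }
  return { state: "OK", breaches: 0 };
}
```

&lt;p&gt;The Pending step keeps a single short spike from paging anyone: the alert fires only after the value stays above the threshold for several evaluations in a row.&lt;/p&gt;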

&lt;h3&gt;
  
  
  Backup &amp;amp; Recovery
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Back up containers, volumes, or stacks&lt;/li&gt;
&lt;li&gt;Cron-based scheduling with retention policies&lt;/li&gt;
&lt;li&gt;S3, MinIO, Azure Blob, GCS, Backblaze B2, SFTP, local&lt;/li&gt;
&lt;li&gt;gzip/zstd compression&lt;/li&gt;
&lt;li&gt;One-click restore&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Multi-Node
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Master/agent architecture with NATS + JetStream&lt;/li&gt;
&lt;li&gt;Internal PKI with mTLS for agent-master communication&lt;/li&gt;
&lt;li&gt;Auto-deploy agents via SSH from the web UI&lt;/li&gt;
&lt;li&gt;Gateway routing — API requests auto-route to the correct node&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Developer Tools
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Terminal&lt;/strong&gt;: Multi-tab browser terminal (xterm.js) — container exec + host SSH&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monaco Editor&lt;/strong&gt;: The editor that powers VS Code, running in the browser for container/host files&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Neovim&lt;/strong&gt;: Neovim with lazy.nvim in the browser via WebSocket&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;File browsers&lt;/strong&gt;: Container filesystem, host filesystem, SFTP browser&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;15 developer utilities&lt;/strong&gt;: Base64, JSON formatter, UUID generator, regex tester, CIDR calculator, JWT decoder, and more&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Snippets&lt;/strong&gt; and &lt;strong&gt;command cheat sheet&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Connections &amp;amp; Integrations
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;SSH (password/key auth, tunnels, port forwarding)&lt;/li&gt;
&lt;li&gt;RDP/VNC via Guacamole (no client software needed)&lt;/li&gt;
&lt;li&gt;Database browser (PostgreSQL, MySQL, MongoDB, Redis, SQLite)&lt;/li&gt;
&lt;li&gt;LDAP browser&lt;/li&gt;
&lt;li&gt;Git integration (Gitea, GitHub, GitLab — repos, PRs, issues, CI/CD)&lt;/li&gt;
&lt;li&gt;Container registry browser (Docker Hub, GHCR, private OCI registries)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Automation
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Outgoing webhooks with retry and delivery logs&lt;/li&gt;
&lt;li&gt;Auto-deploy on Git push&lt;/li&gt;
&lt;li&gt;Runbooks with approval gates&lt;/li&gt;
&lt;li&gt;Scheduled jobs UI for all background tasks&lt;/li&gt;
&lt;li&gt;Image update detection with batch apply + rollback&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Reverse Proxy
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Nginx with auto-HTTPS (Let's Encrypt)&lt;/li&gt;
&lt;li&gt;HTTP-01 and DNS-01 (wildcard) certificate support&lt;/li&gt;
&lt;li&gt;TCP/UDP stream proxying&lt;/li&gt;
&lt;li&gt;Docker exec mode for containerized nginx&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Operations
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Docker daemon configuration (&lt;code&gt;daemon.json&lt;/code&gt;) from the web UI — 50+ settings across 6 categories with risk badges&lt;/li&gt;
&lt;li&gt;Drift detection (expected vs actual container state)&lt;/li&gt;
&lt;li&gt;Change events feed (audit trail of infrastructure changes)&lt;/li&gt;
&lt;li&gt;Resource cost optimization (rightsizing recommendations)&lt;/li&gt;
&lt;li&gt;Session recording and replay&lt;/li&gt;
&lt;li&gt;Operations calendar&lt;/li&gt;
&lt;li&gt;Compliance PDF reports (CIS Docker Benchmark)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Tech Stack
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Technology&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Language&lt;/td&gt;
&lt;td&gt;Go 1.25&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Web&lt;/td&gt;
&lt;td&gt;Chi v5 router&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Templates&lt;/td&gt;
&lt;td&gt;Templ (compiled, type-safe)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CSS&lt;/td&gt;
&lt;td&gt;Tailwind CSS (standalone CLI, no Node.js)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Frontend&lt;/td&gt;
&lt;td&gt;Alpine.js + HTMX&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Terminal&lt;/td&gt;
&lt;td&gt;xterm.js v5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Editor&lt;/td&gt;
&lt;td&gt;Monaco v0.52 + Neovim&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DNS&lt;/td&gt;
&lt;td&gt;miekg/dns&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Database&lt;/td&gt;
&lt;td&gt;PostgreSQL 16 (54 migrations)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cache&lt;/td&gt;
&lt;td&gt;Redis 8 (TLS)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Messaging&lt;/td&gt;
&lt;td&gt;NATS 2.12 (JetStream)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Auth&lt;/td&gt;
&lt;td&gt;JWT + OAuth2/OIDC + LDAP + TOTP&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scanner&lt;/td&gt;
&lt;td&gt;Trivy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Binary&lt;/td&gt;
&lt;td&gt;~70 MB, no Node.js/Python runtime&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Deploy in 60 Seconds
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://raw.githubusercontent.com/fr4nsys/usulnet/main/deploy/install.sh | &lt;span class="nb"&gt;sudo &lt;/span&gt;bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Auto-generates all secrets, starts PostgreSQL + Redis + NATS + Nginx + Guacamole. Access at &lt;code&gt;https://your-server:7443&lt;/code&gt; — default login: &lt;code&gt;admin&lt;/code&gt; / &lt;code&gt;usulnet&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/fr4nsys/usulnet" rel="noopener noreferrer"&gt;github.com/fr4nsys/usulnet&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Website&lt;/strong&gt;: &lt;a href="https://usulnet.com" rel="noopener noreferrer"&gt;usulnet.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docs&lt;/strong&gt;: &lt;a href="https://docs.usulnet.com" rel="noopener noreferrer"&gt;docs.usulnet.com&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you find usulnet useful, a star on GitHub goes a long way.&lt;/p&gt;

</description>
      <category>devops</category>
      <category>docker</category>
      <category>opensource</category>
      <category>showdev</category>
    </item>
    <item>
      <title>I built a self-hosted Docker platform in Go</title>
      <dc:creator>Fran</dc:creator>
      <pubDate>Mon, 09 Feb 2026 22:13:46 +0000</pubDate>
      <link>https://forem.com/fransys/i-built-a-self-hosted-docker-platform-in-go-1j7g</link>
      <guid>https://forem.com/fransys/i-built-a-self-hosted-docker-platform-in-go-1j7g</guid>
      <description>&lt;p&gt;usulnet — Self-hosted Docker management platform&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftvonrr73cjwov5fudp3k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftvonrr73cjwov5fudp3k.png" alt=" " width="800" height="439"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I've been building usulnet, a self-hosted platform for managing Docker infrastructure. It's a single Go binary that handles containers, images, volumes, networks, stacks, security scanning, backups, monitoring, reverse proxy, SSH/RDP/database connections, and multi-node deployments, all from one web UI.&lt;/p&gt;

&lt;p&gt;Key highlights:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Single binary (~50 MB), no Node.js or Python dependencies&lt;/li&gt;
&lt;li&gt;Trivy security scanning with CVE detection and scoring&lt;/li&gt;
&lt;li&gt;Multi-node master/agent architecture with NATS + mTLS&lt;/li&gt;
&lt;li&gt;Built-in terminal (xterm.js), code editor (Monaco), Neovim in browser&lt;/li&gt;
&lt;li&gt;11 notification channels (Slack, Discord, Telegram, PagerDuty, etc.)&lt;/li&gt;
&lt;li&gt;RBAC with 44+ permissions, 2FA, LDAP/OIDC&lt;/li&gt;
&lt;li&gt;Backup &amp;amp; restore to S3/local with cron scheduling&lt;/li&gt;
&lt;li&gt;Reverse proxy management (Caddy + Nginx Proxy Manager)&lt;/li&gt;
&lt;li&gt;Full REST API with OpenAPI 3.0 docs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Tech stack: Go, Chi, Templ, Tailwind CSS, Alpine.js, HTMX, PostgreSQL, Redis, NATS.&lt;/p&gt;

&lt;p&gt;Fast deploy (60 seconds, auto-generated secrets):&lt;br&gt;
  &lt;code&gt;curl -fsSL https://raw.githubusercontent.com/fr4nsys/usulnet/main/deploy/install.sh | bash&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/fr4nsys/usulnet" rel="noopener noreferrer"&gt;https://github.com/fr4nsys/usulnet&lt;/a&gt;&lt;br&gt;
License: AGPL-3.0&lt;/p&gt;

&lt;p&gt;This is the first public beta (v26.2.0). It's functional and used in production, but there may be rough edges. Bug reports and feedback are very welcome; please open an issue on GitHub.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5ifjy0sny89m2r64u088.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5ifjy0sny89m2r64u088.png" alt=" " width="800" height="439"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1oivuw7t55t1i5zxw3m4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1oivuw7t55t1i5zxw3m4.png" alt=" " width="800" height="439"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>devops</category>
      <category>docker</category>
      <category>go</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
