<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Alessio Battistutta</title>
    <description>The latest articles on Forem by Alessio Battistutta (@thatsme).</description>
    <link>https://forem.com/thatsme</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3817925%2F59f63640-0131-4969-b0e3-832fb504aac5.jpg</url>
      <title>Forem: Alessio Battistutta</title>
      <link>https://forem.com/thatsme</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/thatsme"/>
    <language>en</language>
    <item>
      <title>Tomato — Visual DAG editor for NixOS configurations</title>
      <dc:creator>Alessio Battistutta</dc:creator>
      <pubDate>Sat, 11 Apr 2026 07:52:12 +0000</pubDate>
      <link>https://forem.com/thatsme/tomato-visual-dag-editor-for-nixos-configurations-47j6</link>
      <guid>https://forem.com/thatsme/tomato-visual-dag-editor-for-nixos-configurations-47j6</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqmajwd1k0ygo7ft4k915.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqmajwd1k0ygo7ft4k915.jpg" alt=" " width="800" height="398"&gt;&lt;/a&gt;&lt;br&gt;
Visual hierarchical graph editor that generates configuration.nix and deploys via nixos-rebuild switch.&lt;/p&gt;

&lt;p&gt;Nodes are Nix fragments; Gateways descend into subgraphs (floors).&lt;br&gt;
NixOS merges the composed fragments automatically.&lt;br&gt;
One-click deploy to a real NixOS machine, plus an OODN registry for ambient config (hostname, timezone, stateVersion...).&lt;/p&gt;

&lt;p&gt;Pre-built stacks (Grafana+Prometheus, Web Server, etc.)&lt;/p&gt;

&lt;p&gt;Built with Elixir/Phoenix. Early stage but working end-to-end.&lt;br&gt;
Comments, ideas, and improvement requests are welcome.&lt;/p&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/thatsme/Tomato" rel="noopener noreferrer"&gt;Tomato&lt;/a&gt;&lt;/p&gt;

</description>
      <category>nixos</category>
      <category>elixir</category>
      <category>infrastructure</category>
      <category>opensource</category>
    </item>
    <item>
      <title>The Complexity Trap: What Tainter Teaches Us About Agentic Systems</title>
      <dc:creator>Alessio Battistutta</dc:creator>
      <pubDate>Wed, 08 Apr 2026 07:20:01 +0000</pubDate>
      <link>https://forem.com/thatsme/the-complexity-trap-what-tainter-teaches-us-about-agentic-systems-ahf</link>
      <guid>https://forem.com/thatsme/the-complexity-trap-what-tainter-teaches-us-about-agentic-systems-ahf</guid>
      <description>&lt;p&gt;You've felt it. The codebase that fights back. The abstraction layer nobody dares touch. The microservice split that made sense three years ago and now requires a dedicated team just to operate. Joseph Tainter had a name for this in 1988 — and it's darker than technical debt.&lt;/p&gt;

&lt;p&gt;Tainter's thesis in The Collapse of Complex Societies is deceptively simple: societies don't collapse because they fail — they collapse because complexity stops paying for itself. Every layer added to solve a problem yields diminishing returns, while the cost of maintaining that layer keeps rising. At some point, the math inverts. Complexity becomes the problem.&lt;/p&gt;

&lt;p&gt;Software engineers live this every day. The hotfix that births three workarounds. The codebase that becomes load-bearing scar tissue. Eventually, more engineering time is spent managing existing complexity than producing new value — the Tainter inflection point, in code form.&lt;/p&gt;

&lt;p&gt;But deterministic systems at least collapse predictably. The failure modes are traceable. Call graphs, dependency trees, config sprawl — you can reason about what broke and why. Such a system breaks the same way every time. Classic Tainter curve.&lt;/p&gt;

&lt;h2&gt;
  
  
  Agentic systems break the model.
&lt;/h2&gt;

&lt;p&gt;When you chain LLM calls into autonomous workflows, the complexity isn't just structural — it's behavioral and non-reproducible. Every LLM call is a sample from a probability distribution. Chain enough of them and the system's emergent behavior is the product of those distributions. Variance doesn't cancel — it compounds. You haven't built a function; you've built a stochastic process dressed as one.&lt;/p&gt;

&lt;p&gt;This is where Tainter gets darker. The natural response to unpredictable LLM output is mitigation: guardrails, validators, retry logic, output sanitizers, confidence thresholds, fallback chains. Each layer adds complexity to manage the chaos of the layer below. But each mitigation layer is itself stochastic — it too samples, classifies, decides. You end up adding complexity that is also unpredictable. The complexity meant to tame variance introduces new variance. The guardrail needs a guardrail. Tainter would recognize this immediately: complexity generating the very problems it was meant to solve.&lt;/p&gt;

&lt;p&gt;The collapse vector in most agentic frameworks is that they don't respect the boundary between stochastic and deterministic. They trust LLM output structurally — parse it, route on it, act on it — and then patch the failures reactively with more stochastic layers. The epistemic problem is that you can't enumerate the failure modes of a compounded probability distribution. The system becomes too unpredictable to reason about, too entangled to refactor. Collapse — not with a bang, but as silent behavioral drift nobody can explain or reproduce.&lt;/p&gt;

&lt;h2&gt;
  
  
  The architectural response is a clean membrane.
&lt;/h2&gt;

&lt;p&gt;You cannot fully determinize a stochastic system without destroying what makes it useful. The LLM's value is its probabilistic nature — generalization, inference under ambiguity, flexible intent parsing. The goal isn't to eliminate stochasticity; it's to bound it tightly and treat everything that crosses the boundary as untrusted input.&lt;/p&gt;

&lt;p&gt;This is the core design principle behind &lt;a href="https://github.com/thatsme/AlexClaw" rel="noopener noreferrer"&gt;AlexClaw&lt;/a&gt; — a BEAM-native AI agent framework built for regulated, air-gapped infrastructure where "&lt;em&gt;&lt;strong&gt;just call the cloud API&lt;/strong&gt;&lt;/em&gt;" isn't an option. The LLM touches only the intent parsing and skill selection layer. Every output crosses a sanitization choke point before it can influence system state. Downstream — OTP supervision trees, capability tokens, a PolicyEngine with explicit AuthContext — is pure deterministic BEAM. The stochastic surface is small, explicit, and bounded.&lt;/p&gt;
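
&lt;p&gt;A minimal sketch of that choke point, with illustrative names (not AlexClaw's actual API): LLM output crosses into the deterministic side only if it parses onto a closed set of known skills.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight elixir"&gt;&lt;code&gt;defmodule Membrane do
  # Treat LLM output as untrusted input: it crosses the boundary only
  # if it maps onto a skill the deterministic side already knows.
  @allowed_skills %{
    "rss_collector" =&amp;gt; :rss_collector,
    "web_search" =&amp;gt; :web_search,
    "research" =&amp;gt; :research
  }

  def sanitize(llm_output) when is_binary(llm_output) do
    key = llm_output |&amp;gt; String.trim() |&amp;gt; String.downcase()

    case Map.fetch(@allowed_skills, key) do
      {:ok, skill} -&amp;gt; {:ok, skill}
      :error -&amp;gt; {:error, :unrecognized_skill}
    end
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Everything past &lt;code&gt;sanitize/1&lt;/code&gt; deals in atoms from a finite set, not free-form model text.&lt;/p&gt;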

&lt;p&gt;Everything that matters about system reliability lives outside the stochastic layer.&lt;/p&gt;

&lt;p&gt;Most agentic frameworks make the opposite choice, often implicitly. They're optimized for capability — what the agent can do — without an explicit model of where probabilistic reasoning should stop and deterministic execution should begin. That's a Tainter trap: complexity added for capability, with the collapse cost deferred and compounded.&lt;/p&gt;

&lt;p&gt;The question worth asking before adding the next agent layer isn't "&lt;em&gt;&lt;strong&gt;what can this enable?&lt;/strong&gt;&lt;/em&gt;" It's "&lt;em&gt;&lt;strong&gt;where does this sit on the stochastic/deterministic membrane, and what does it cost when it's wrong?&lt;/strong&gt;&lt;/em&gt;"&lt;br&gt;
Tainter's societies couldn't rewrite themselves. We can. But only if we draw the boundary before the complexity makes that choice for us.&lt;/p&gt;

</description>
      <category>complexity</category>
      <category>architecture</category>
      <category>ai</category>
      <category>elixir</category>
    </item>
    <item>
      <title>AlexClaw: A BEAM-Native Personal AI Agent Built on Elixir/OTP</title>
      <dc:creator>Alessio Battistutta</dc:creator>
      <pubDate>Tue, 17 Mar 2026 10:10:58 +0000</pubDate>
      <link>https://forem.com/thatsme/alexclaw-a-beam-native-personal-ai-agent-built-on-elixirotp-25ma</link>
      <guid>https://forem.com/thatsme/alexclaw-a-beam-native-personal-ai-agent-built-on-elixirotp-25ma</guid>
      <description>&lt;h1&gt;
  
  
  AlexClaw: A BEAM-Native Personal AI Agent Built on Elixir/OTP
&lt;/h1&gt;

&lt;p&gt;AlexClaw is a personal autonomous AI agent that monitors RSS feeds, web sources, GitHub repositories, and Google services — accumulates knowledge in PostgreSQL, executes multi-step workflows on schedule, and communicates via Telegram. It runs entirely on your infrastructure.&lt;/p&gt;

&lt;p&gt;The key architectural decision: the BEAM VM is the runtime, not a container for Python-style orchestration. Supervision trees, ETS caching, GenServers, and PubSub are the actual building blocks — not abstractions bolted on top.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/thatsme/AlexClaw" rel="noopener noreferrer"&gt;github.com/thatsme/AlexClaw&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9b3dkgx00w5lf7eiox95.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9b3dkgx00w5lf7eiox95.jpg" alt="AlexClaw Dashboard" width="800" height="413"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What It Does
&lt;/h2&gt;

&lt;p&gt;AlexClaw runs workflows — multi-step pipelines that combine skills (RSS collection, web search, LLM summarization, API calls, browser automation) and deliver results to Telegram. Workflows run on cron schedules or on demand.&lt;/p&gt;

&lt;p&gt;A typical workflow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Collect&lt;/strong&gt; — fetch 8 RSS feeds concurrently, deduplicate against memory&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Score&lt;/strong&gt; — batch-score 20+ article titles in a single LLM call for relevance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Summarize&lt;/strong&gt; — pass the top items through an LLM transform with a prompt template&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deliver&lt;/strong&gt; — send the briefing to Telegram&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This runs every morning at 7:00 with zero interaction.&lt;/p&gt;
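
&lt;p&gt;As an illustration only (this is not AlexClaw's real step API), the shape of such a pipeline in Elixir:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight elixir"&gt;&lt;code&gt;defmodule Briefing do
  # Each step is injected as a function, mirroring collect/score/summarize/deliver.
  def run(feeds, score_fn, summarize_fn, deliver_fn) do
    feeds
    |&amp;gt; Enum.flat_map(&amp;amp; &amp;amp;1.items)   # 1. collect from all feeds
    |&amp;gt; Enum.uniq_by(&amp;amp; &amp;amp;1.url)       #    deduplicate against memory
    |&amp;gt; score_fn.()                    # 2. one batched LLM scoring call
    |&amp;gt; Enum.take(5)                   # 3. keep the top items
    |&amp;gt; summarize_fn.()                #    LLM transform with a template
    |&amp;gt; deliver_fn.()                  # 4. send the briefing to Telegram
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;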




&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Telegram &amp;lt;──&amp;gt; Gateway (GenServer) &amp;lt;──&amp;gt; Dispatcher (pattern matching)
                                            │
                                      SkillSupervisor
                                     (DynamicSupervisor)
                                            │
                              ┌─────────────┼─────────────┐
                           RSS            Research      GitHub
                          Skill            Skill     Security Review
                                            │
                                       LLM Router
                              (Gemini / Anthropic / Local)
                                            │
                                ┌───────────┴───────────┐
                             Memory                  Config
                          (pgvector)             (DB + ETS + PubSub)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Supervision Tree (13 children)
&lt;/h3&gt;

&lt;p&gt;The application starts 13 supervised processes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Repo&lt;/strong&gt; — PostgreSQL connection pool (Ecto)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PubSub&lt;/strong&gt; — config change broadcast to all processes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TaskSupervisor&lt;/strong&gt; — supervised fire-and-forget tasks (workflow execution, background reviews)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;UsageTracker&lt;/strong&gt; — ETS owner for LLM call counters, persisted to DB&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Config.Loader&lt;/strong&gt; — seeds environment variables into DB, loads into ETS&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LogBuffer&lt;/strong&gt; — in-memory ring buffer (500 entries) attached to Erlang Logger&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Google.TokenManager&lt;/strong&gt; — OAuth2 token lifecycle with auto-refresh&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RateLimiter.Server&lt;/strong&gt; — ETS-based login rate limiting with periodic purge&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SkillSupervisor&lt;/strong&gt; — DynamicSupervisor for isolated skill execution&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scheduler&lt;/strong&gt; — Quantum cron scheduler&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SchedulerSync&lt;/strong&gt; — syncs DB workflow schedules into Quantum jobs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gateway&lt;/strong&gt; — Telegram long-polling bot&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Endpoint&lt;/strong&gt; — Phoenix HTTP server (LiveView admin UI)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every async operation (workflow runs, GitHub reviews, background tasks) executes under &lt;code&gt;Task.Supervisor&lt;/code&gt; — crashes are reported, not silently lost.&lt;/p&gt;
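
&lt;p&gt;For example (&lt;code&gt;WorkflowRunner&lt;/code&gt; is illustrative; &lt;code&gt;Task.Supervisor.start_child/2&lt;/code&gt; is the real API), kicking off a workflow run looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight elixir"&gt;&lt;code&gt;# If this task crashes, the supervisor logs the exit instead of losing it.
Task.Supervisor.start_child(AlexClaw.TaskSupervisor, fn -&amp;gt;
  WorkflowRunner.execute(workflow_id)
end)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;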

&lt;h3&gt;
  
  
  Why the BEAM
&lt;/h3&gt;

&lt;p&gt;The BEAM gives you things for free that other runtimes require libraries or infrastructure for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Process isolation&lt;/strong&gt; — a failed RSS fetch doesn't affect a concurrent research query. Each skill runs in its own process.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Supervision&lt;/strong&gt; — if a GenServer crashes, it restarts. The application recovers without external health checks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ETS&lt;/strong&gt; — in-process shared memory tables for config cache, usage counters, rate limiting, and token caching. No Redis needed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PubSub&lt;/strong&gt; — config changes broadcast to all processes immediately. No polling.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lightweight concurrency&lt;/strong&gt; — RSS feeds are fetched concurrently with &lt;code&gt;Task.async_stream&lt;/code&gt;. Workflow steps run sequentially but the workflow itself runs in a supervised task.&lt;/li&gt;
&lt;/ul&gt;
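
&lt;p&gt;Concurrent feed fetching with &lt;code&gt;Task.async_stream&lt;/code&gt; might look like this (&lt;code&gt;fetch_feed/1&lt;/code&gt; is a placeholder):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight elixir"&gt;&lt;code&gt;results =
  feed_urls
  |&amp;gt; Task.async_stream(&amp;amp;fetch_feed/1,
    max_concurrency: 8,
    timeout: 10_000,
    on_timeout: :kill_task
  )
  |&amp;gt; Enum.flat_map(fn
    {:ok, items} -&amp;gt; items      # successful fetch
    {:exit, _reason} -&amp;gt; []     # one slow feed doesn't sink the rest
  end)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;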




&lt;h2&gt;
  
  
  The LLM Router
&lt;/h2&gt;

&lt;p&gt;Every LLM call in AlexClaw declares a tier: &lt;code&gt;:light&lt;/code&gt;, &lt;code&gt;:medium&lt;/code&gt;, &lt;code&gt;:heavy&lt;/code&gt;, or &lt;code&gt;:local&lt;/code&gt;. The router selects the cheapest available model for that tier and falls back automatically.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;light&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  &lt;span class="s"&gt;gemini_flash → haiku → lm_studio → ollama → custom providers&lt;/span&gt;
&lt;span class="na"&gt;medium&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;gemini_pro → sonnet → lm_studio → ollama → custom providers&lt;/span&gt;
&lt;span class="na"&gt;heavy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  &lt;span class="s"&gt;opus → lm_studio → ollama → custom providers&lt;/span&gt;
&lt;span class="na"&gt;local&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  &lt;span class="s"&gt;lm_studio → ollama → custom providers&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Daily usage is tracked per provider in ETS and persisted to PostgreSQL. Each provider has a configurable daily limit. When a provider hits its limit, the router skips it and tries the next one.&lt;/p&gt;

&lt;p&gt;Custom providers (any OpenAI-compatible endpoint) can be added via the admin UI. This means you can run multiple local models on LM Studio — same host, different model names, each with its own tier and limit.&lt;/p&gt;

&lt;p&gt;A fully local deployment with zero API keys works — set &lt;code&gt;LMSTUDIO_ENABLED=true&lt;/code&gt; and all tiers route to your local model.&lt;/p&gt;
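
&lt;p&gt;The fallback logic can be sketched as a walk down the tier's chain, skipping anything unavailable or over its daily limit (a simplification of the real router):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight elixir"&gt;&lt;code&gt;defmodule Router do
  @chains %{
    light:  [:gemini_flash, :haiku, :lm_studio, :ollama],
    medium: [:gemini_pro, :sonnet, :lm_studio, :ollama],
    heavy:  [:opus, :lm_studio, :ollama],
    local:  [:lm_studio, :ollama]
  }

  # `available` is the set of providers that are enabled and under limit.
  def pick(tier, available) do
    case Enum.find(@chains[tier], &amp;amp;(&amp;amp;1 in available)) do
      nil -&amp;gt; {:error, :no_provider_available}
      provider -&amp;gt; {:ok, provider}
    end
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;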

&lt;h3&gt;
  
  
  Cost Tracking
&lt;/h3&gt;

&lt;p&gt;The router doesn't just fall back — it actively minimizes cost:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;RSS relevance scoring uses &lt;code&gt;:light&lt;/code&gt; tier (Gemini Flash free tier: 250 calls/day)&lt;/li&gt;
&lt;li&gt;Research synthesis uses &lt;code&gt;:medium&lt;/code&gt; tier&lt;/li&gt;
&lt;li&gt;Deep reasoning uses &lt;code&gt;:heavy&lt;/code&gt; tier (explicit only, never auto-selected)&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;:local&lt;/code&gt; tier bypasses all cloud providers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Usage counters reset at midnight UTC via &lt;code&gt;Process.send_after&lt;/code&gt; in the UsageTracker GenServer.&lt;/p&gt;
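
&lt;p&gt;The reset loop is the classic self-scheduling &lt;code&gt;Process.send_after&lt;/code&gt; pattern (sketched here; the table name is illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight elixir"&gt;&lt;code&gt;def handle_info(:reset_counters, state) do
  :ets.delete_all_objects(:usage_counters)
  Process.send_after(self(), :reset_counters, ms_until_midnight_utc())
  {:noreply, state}
end

defp ms_until_midnight_utc do
  now = DateTime.utc_now()
  midnight = %{now | hour: 0, minute: 0, second: 0, microsecond: {0, 0}}
  next = DateTime.add(midnight, 86_400, :second)
  DateTime.diff(next, now, :millisecond)
end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;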




&lt;h2&gt;
  
  
  Runtime Configuration
&lt;/h2&gt;

&lt;p&gt;All settings live in PostgreSQL, cached in ETS, editable at runtime via the admin UI. No restart required for any change.&lt;/p&gt;

&lt;p&gt;On first boot, &lt;code&gt;Config.Loader&lt;/code&gt; seeds default values from environment variables. After that, the database is the source of truth. When a value changes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Written to PostgreSQL&lt;/li&gt;
&lt;li&gt;Updated in ETS cache&lt;/li&gt;
&lt;li&gt;Broadcast via Phoenix PubSub&lt;/li&gt;
&lt;li&gt;All subscribed processes see the change immediately&lt;/li&gt;
&lt;/ol&gt;
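
&lt;p&gt;The write path, sketched (schema, table, and topic names are illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight elixir"&gt;&lt;code&gt;def put(key, value) do
  %Setting{key: key, value: value}
  |&amp;gt; Repo.insert!(on_conflict: :replace_all, conflict_target: :key)  # 1. persist
  :ets.insert(:config_cache, {key, value})                            # 2. refresh cache
  Phoenix.PubSub.broadcast(AlexClaw.PubSub, "config",                 # 3. notify subscribers
    {:config_changed, key, value})
  :ok
end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;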

&lt;p&gt;Categories include: identity/persona, LLM API keys and limits, Telegram settings, GitHub tokens, Google OAuth, rate limiting thresholds, prompt templates, and skill-specific config.&lt;/p&gt;

&lt;p&gt;The system prompt is fully configurable — persona name, base prompt, and per-skill context fragments are all config keys. Zero hardcoded strings.&lt;/p&gt;




&lt;h2&gt;
  
  
  Workflow Engine
&lt;/h2&gt;

&lt;p&gt;Workflows are multi-step pipelines stored in PostgreSQL. Each step specifies:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Skill name&lt;/strong&gt; — which registered skill to execute&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Config JSON&lt;/strong&gt; — step-level overrides (different repo, different API token, different prompt)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LLM tier/provider&lt;/strong&gt; — override the default routing for this step&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;input_from&lt;/strong&gt; — pull input from a specific earlier step (not just the previous one)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Provider routing has three levels of specificity:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Step-level &lt;code&gt;llm_tier&lt;/code&gt; and &lt;code&gt;llm_model&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Workflow-level &lt;code&gt;default_provider&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Global tier-based fallback chain&lt;/li&gt;
&lt;/ol&gt;
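
&lt;p&gt;Precedence resolution is a natural fit for Elixir's &lt;code&gt;||&lt;/code&gt; chain (field names follow the list above; &lt;code&gt;tier_fallback/1&lt;/code&gt; stands in for the global chain):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight elixir"&gt;&lt;code&gt;# Most specific wins: step override, then workflow default, then the tier chain.
defp resolve_provider(step, workflow) do
  step.llm_model || workflow.default_provider || tier_fallback(step.llm_tier)
end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;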

&lt;h3&gt;
  
  
  12 Registered Skills
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Skill&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;rss_collector&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Fetch RSS feeds, batch-score relevance, notify&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;web_search&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Search DuckDuckGo, fetch results, synthesize answer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;web_browse&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Fetch URL, extract text, answer questions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;research&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Deep research with memory context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;conversational&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Free-text LLM conversation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;telegram_notify&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Send workflow output to Telegram&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;llm_transform&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Run a prompt template through the LLM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;api_request&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Authenticated HTTP requests with retries&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;github_security_review&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;PR/commit diff security analysis&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;google_calendar&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Fetch upcoming Google Calendar events&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;google_tasks&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;List and create Google Tasks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;web_automation&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Browser recording and headless replay&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F51n24oxymvqmu74yfzio.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F51n24oxymvqmu74yfzio.jpg" alt="AlexClaw Skills" width="800" height="575"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Skills implement the &lt;code&gt;AlexClaw.Skill&lt;/code&gt; behaviour — two callbacks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight elixir"&gt;&lt;code&gt;&lt;span class="nv"&gt;@callback&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;::&lt;/span&gt; &lt;span class="no"&gt;String&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nv"&gt;@callback&lt;/span&gt; &lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;keyword&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="p"&gt;::&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="ss"&gt;:ok&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;term&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="ss"&gt;:error&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;term&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Adding a new skill is one module, one registry entry.&lt;/p&gt;
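
&lt;p&gt;A toy skill implementing that contract (the module body is illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight elixir"&gt;&lt;code&gt;defmodule AlexClaw.Skills.Echo do
  @behaviour AlexClaw.Skill

  @impl true
  def description, do: "Echoes its input back, for testing pipelines"

  @impl true
  def run(opts) do
    case Keyword.fetch(opts, :input) do
      {:ok, input} -&amp;gt; {:ok, input}
      :error -&amp;gt; {:error, :missing_input}
    end
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;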




&lt;h2&gt;
  
  
  Memory System
&lt;/h2&gt;

&lt;p&gt;PostgreSQL + pgvector for persistent knowledge storage.&lt;/p&gt;

&lt;p&gt;Each memory entry has a &lt;code&gt;kind&lt;/code&gt; (&lt;code&gt;:news_item&lt;/code&gt;, &lt;code&gt;:summary&lt;/code&gt;, &lt;code&gt;:conversation&lt;/code&gt;, &lt;code&gt;:security_review&lt;/code&gt;), content, optional source URL, JSONB metadata, optional vector embedding, and optional TTL.&lt;/p&gt;

&lt;p&gt;Search uses cosine similarity on pgvector when embeddings are available, with keyword (ILIKE) fallback. Deduplication by URL prevents the same article from being scored and notified twice.&lt;/p&gt;
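
&lt;p&gt;The URL deduplication check reduces to a single query (schema and repo names are illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight elixir"&gt;&lt;code&gt;# Inside the memory context module; Memory is the Ecto schema.
import Ecto.Query

def seen?(url) do
  AlexClaw.Repo.exists?(from m in Memory, where: m.source_url == ^url)
end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;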

&lt;p&gt;The RSS collector stores every worthy item. The research skill stores summaries. The conversational skill stores both user messages and assistant responses for context continuity.&lt;/p&gt;




&lt;h2&gt;
  
  
  Security Features
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Session-based authentication&lt;/strong&gt; on all admin routes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TOTP 2FA&lt;/strong&gt; — optional two-factor for sensitive workflow execution (setup via Telegram, 2-minute challenge expiry)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Login rate limiting&lt;/strong&gt; — ETS-based, configurable max attempts and block duration&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HMAC-SHA256 webhook verification&lt;/strong&gt; — raw body cached before JSON parsing for correct signature verification&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Telegram chat_id filtering&lt;/strong&gt; — rejects messages from unauthorized users&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Timing-safe comparison&lt;/strong&gt; — &lt;code&gt;Plug.Crypto.secure_compare&lt;/code&gt; for all secret comparisons&lt;/li&gt;
&lt;/ul&gt;
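
&lt;p&gt;Signature verification over the raw body, sketched in the GitHub webhook style (the &lt;code&gt;sha256=...&lt;/code&gt; header format is an assumption; &lt;code&gt;:crypto.mac/4&lt;/code&gt; and &lt;code&gt;Plug.Crypto.secure_compare/2&lt;/code&gt; are real APIs):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight elixir"&gt;&lt;code&gt;def valid_signature?(raw_body, signature_header, secret) do
  digest = :crypto.mac(:hmac, :sha256, secret, raw_body)
  expected = "sha256=" &amp;lt;&amp;gt; Base.encode16(digest, case: :lower)
  # Constant-time comparison prevents timing attacks on the secret.
  Plug.Crypto.secure_compare(expected, signature_header)
end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;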




&lt;h2&gt;
  
  
  Admin UI
&lt;/h2&gt;

&lt;p&gt;Phoenix LiveView — fully server-rendered, no JavaScript hooks. 12 pages covering every aspect of the system.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm8ro6gdzhf6rl47818in.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm8ro6gdzhf6rl47818in.jpg" alt="AlexClaw Workflows" width="800" height="392"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Deployment
&lt;/h2&gt;

&lt;p&gt;Single &lt;code&gt;docker compose up -d&lt;/code&gt;. The stack:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Elixir release&lt;/strong&gt; — compiled OTP release (Alpine-based, ~125 MB runtime)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PostgreSQL 17 + pgvector&lt;/strong&gt; — persistent storage&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Web automator&lt;/strong&gt; (optional) — Python/Playwright sidecar for browser automation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Minimum requirements: Docker, a Telegram bot token, and at least one LLM provider (can be fully local).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/thatsme/AlexClaw.git
&lt;span class="nb"&gt;cd &lt;/span&gt;AlexClaw
&lt;span class="nb"&gt;cp&lt;/span&gt; .env.example .env
&lt;span class="c"&gt;# Edit .env with your credentials&lt;/span&gt;
docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Tech Stack
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Technology&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Runtime&lt;/td&gt;
&lt;td&gt;Elixir 1.19 / OTP 28 / BEAM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Web framework&lt;/td&gt;
&lt;td&gt;Phoenix 1.7 + LiveView&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HTTP server&lt;/td&gt;
&lt;td&gt;Bandit&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Database&lt;/td&gt;
&lt;td&gt;PostgreSQL 17 + pgvector&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HTTP client&lt;/td&gt;
&lt;td&gt;Req&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cron scheduler&lt;/td&gt;
&lt;td&gt;Quantum&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RSS parsing&lt;/td&gt;
&lt;td&gt;SweetXml&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HTML parsing&lt;/td&gt;
&lt;td&gt;Floki&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2FA&lt;/td&gt;
&lt;td&gt;NimbleTOTP + EQRCode&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Browser automation&lt;/td&gt;
&lt;td&gt;Playwright (Python sidecar)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;The &lt;a href="https://github.com/thatsme/AlexClaw/blob/main/ROADMAP.md" rel="noopener noreferrer"&gt;ROADMAP.md&lt;/a&gt; in the repository tracks planned features. Current priorities include embedding integration for semantic memory search, additional LLM providers, and workflow branching logic.&lt;/p&gt;




&lt;p&gt;AlexClaw is open source under the Apache License 2.0.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/thatsme/AlexClaw" rel="noopener noreferrer"&gt;github.com/thatsme/AlexClaw&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Built by &lt;a href="https://github.com/thatsme" rel="noopener noreferrer"&gt;Alessio Battistutta&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>elixir</category>
      <category>ai</category>
      <category>opensource</category>
      <category>agents</category>
    </item>
    <item>
      <title>How We Told a Stranger's Node Where Its Cache Should Be</title>
      <dc:creator>Alessio Battistutta</dc:creator>
      <pubDate>Wed, 11 Mar 2026 07:49:08 +0000</pubDate>
      <link>https://forem.com/thatsme/how-we-told-a-strangers-node-where-its-cache-should-be-209b</link>
      <guid>https://forem.com/thatsme/how-we-told-a-strangers-node-where-its-cache-should-be-209b</guid>
      <description>&lt;p&gt;We connected to a remote BEAM node we don't own. No access to its source code. No instrumentation planted beforehand. No agents installed, no probes compiled into the target. Pure black-box runtime observation over Distributed Erlang.&lt;/p&gt;

&lt;p&gt;From that, we produced a concrete architectural recommendation: &lt;em&gt;your schema registry is hitting PostgreSQL 354 times per observation window for metadata that almost never changes — that should be ETS, not a database table&lt;/em&gt;.&lt;/p&gt;
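
&lt;p&gt;The recommended shape is a read-through ETS cache in front of PostgreSQL; a sketch, with &lt;code&gt;load_from_db/1&lt;/code&gt; standing in for Nexus's existing query:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight elixir"&gt;&lt;code&gt;def schema_for(table) do
  case :ets.lookup(:schema_cache, table) do
    [{^table, schema}] -&amp;gt;
      schema                              # hot path: no database round trip
    [] -&amp;gt;
      schema = load_from_db(table)        # cold path: hit PostgreSQL once
      :ets.insert(:schema_cache, {table, schema})
      schema
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;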

&lt;p&gt;This article walks through how we got there — 7 observation sessions against a live Elixir application, progressing from coarse process-level metrics to function-level tracing that revealed the internal structure of code we'd never read.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Target: Nexus
&lt;/h2&gt;

&lt;p&gt;Nexus is an Elixir API gateway that manages dynamic PostgreSQL connections, runtime schema metadata, and table operations. It exposes a REST API on port 4040 and handles CRUD, aggregations, search, CSV export, batch operations, and multi-database routing — all backed by Ecto dynamic repos.&lt;/p&gt;

&lt;p&gt;We exercised it with an integration test suite: 61 tests across 8 categories including connection management, extended CRUD (JSONB, NULLs, hierarchical categories), edge cases (SQL injection, unicode, 10KB strings), concurrent operations (50 concurrent inserts, 5000+ req/s bursts), large datasets (50k rows), and a 60-second stress test with 20 concurrent workers.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Setup
&lt;/h2&gt;

&lt;p&gt;Giulia, the tool doing the observing, runs as two Docker containers from the same image:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Worker&lt;/strong&gt; (port 4000) — static analysis engine: AST indexing, Knowledge Graph, dependency topology, complexity metrics, embeddings&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitor&lt;/strong&gt; (port 4001) — runtime observer: connects to target BEAM nodes via Distributed Erlang, collects snapshots at configurable intervals, pushes data to the Worker for fusion&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The observation workflow is command-driven:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;giulia-observe start nexus@192.168.10.174                           &lt;span class="c"&gt;# process-level only&lt;/span&gt;
giulia-observe start nexus@192.168.10.174 cookie 5000 Nexus.Repo    &lt;span class="c"&gt;# + function tracing&lt;/span&gt;
&amp;lt;run your workload&amp;gt;
giulia-observe stop nexus@192.168.10.174                            &lt;span class="c"&gt;# finalize fused profile&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  First Bug: &lt;code&gt;--sname&lt;/code&gt; vs &lt;code&gt;--name&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;First connection failed — Nexus uses long names (&lt;code&gt;--name nexus@192.168.10.174&lt;/code&gt;), Giulia used short names (&lt;code&gt;--sname worker&lt;/code&gt;). Erlang refuses to connect across name modes. One flag change in &lt;code&gt;docker-compose.yml&lt;/code&gt; fixed it.&lt;/p&gt;
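
&lt;p&gt;With matching name modes and the right cookie, attaching from IEx is two calls (the cookie value here is a placeholder for whatever the target node was started with):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight elixir"&gt;&lt;code&gt;Node.set_cookie(:"nexus@192.168.10.174", :secret_cookie)
Node.connect(:"nexus@192.168.10.174")   # returns true on success
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;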




&lt;h2&gt;
  
  
  Process-Level Profiling: Finding the Logger
&lt;/h2&gt;

&lt;p&gt;The first five sessions used process-level observation: BEAM metrics (memory, process count, scheduler run queue) and top-process rankings by CPU. This tells you &lt;em&gt;which modules own the hottest processes&lt;/em&gt; but not which functions are being called.&lt;/p&gt;

&lt;p&gt;We ran the same workload under different configurations:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Session&lt;/th&gt;
&lt;th&gt;Duration&lt;/th&gt;
&lt;th&gt;Run Queue&lt;/th&gt;
&lt;th&gt;Top Module&lt;/th&gt;
&lt;th&gt;#2 Module&lt;/th&gt;
&lt;th&gt;Logger CPU&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1. First contact (21/49 tests)&lt;/td&gt;
&lt;td&gt;180s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;17&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;:proc_lib&lt;/code&gt; 100%&lt;/td&gt;
&lt;td&gt;(below threshold)&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2. Idle baseline&lt;/td&gt;
&lt;td&gt;21s&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;:code_server&lt;/code&gt; 38.7%&lt;/td&gt;
&lt;td&gt;(below threshold)&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3. Full suite, info log&lt;/td&gt;
&lt;td&gt;77s&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;:proc_lib&lt;/code&gt; 54.3%&lt;/td&gt;
&lt;td&gt;&lt;code&gt;:logger_std_h_default&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;28.5%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4. Full suite, warning log&lt;/td&gt;
&lt;td&gt;77s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;5&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;:logger_std_h_default&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;:proc_lib&lt;/code&gt; 33.2%&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;39.1%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5. Full suite, async handler&lt;/td&gt;
&lt;td&gt;71s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;:proc_lib&lt;/code&gt; 98.9%&lt;/td&gt;
&lt;td&gt;(below threshold)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Gone&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Session 1 flagged scheduler contention (run queue of 17 — more tasks queued than schedulers available). Sessions 3-4 found something more interesting: Erlang's default logger handler does synchronous IO. Every log call blocks the calling process until the write completes. The overhead is per-call, not per-byte — so raising the log level to &lt;code&gt;:warning&lt;/code&gt; (Session 4) actually made it worse. Fewer messages, but each one still blocks, and now the logger is a larger proportion of total work.&lt;/p&gt;

&lt;p&gt;The fix:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight elixir"&gt;&lt;code&gt;&lt;span class="ss"&gt;:logger&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;update_handler_config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:default&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;:config&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;%{&lt;/span&gt;
  &lt;span class="ss"&gt;sync_mode_qlen:&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;     &lt;span class="c1"&gt;# async until queue hits 1000&lt;/span&gt;
  &lt;span class="ss"&gt;drop_mode_qlen:&lt;/span&gt; &lt;span class="mi"&gt;2000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;     &lt;span class="c1"&gt;# start dropping above 2000&lt;/span&gt;
  &lt;span class="ss"&gt;flush_qlen:&lt;/span&gt; &lt;span class="mi"&gt;5000&lt;/span&gt;           &lt;span class="c1"&gt;# emergency flush&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Session 5 confirmed: logger gone from top modules, run queue dropped from 5 to 2. Bottleneck eliminated.&lt;/p&gt;

&lt;p&gt;But process-level profiling had hit its ceiling. Session 5 showed &lt;code&gt;:proc_lib&lt;/code&gt; at 98.9% CPU — which is like saying "OTP processes are running." Every GenServer, Task, and supervised process runs through &lt;code&gt;:proc_lib&lt;/code&gt;. We needed to see inside.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Stress Test: 919 ops/s, Zero Errors
&lt;/h2&gt;

&lt;p&gt;Before diving deeper, we ran the full suite including a 60-second sustained stress test. Nexus handled the load cleanly — the interesting question was &lt;em&gt;how&lt;/em&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Duration:       60.0s          Latency (ms):
Total ops:      55,151           min:     0.7
Errors:         0 (0.0%)        median:  5.4
Throughput:     919 ops/s        p95:    93.1
                                 p99:   156.9
Operations breakdown:            max:   287.2
  select    13,619    paginate   5,489
  insert    11,203    aggregate  5,440
  count      8,291    distinct   2,818
  search     5,553    exists     2,738
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Memory stayed flat at 87-88 MB and the process count held at 459. Raw capacity was never in question; the interesting findings are in &lt;em&gt;how&lt;/em&gt; the work gets done.&lt;/p&gt;




&lt;h2&gt;
  
  
  Function-Level Tracing
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Remote Tracing Problem
&lt;/h3&gt;

&lt;p&gt;Erlang's &lt;code&gt;:erlang.trace&lt;/code&gt; only works on the local node. Giulia's Monitor runs on a different node than Nexus. There are three ways to run code on a remote BEAM:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;:rpc.call(node, Module, :function, args)&lt;/code&gt;&lt;/strong&gt; — module must be loaded on the target&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anonymous function via RPC&lt;/strong&gt; — closures reference their defining module; fails with &lt;code&gt;:undef&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;Code.eval_string&lt;/code&gt; via RPC&lt;/strong&gt; — source code as a string, compiled on the target using only its stdlib&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Options 1 and 2 fail because Nexus doesn't have Giulia's code, and Erlang closures aren't portable. Build 135 uses option 3: a self-contained Elixir code string sent via &lt;code&gt;:rpc.call(node, Code, :eval_string, [code])&lt;/code&gt;. It spawns a collector, enables tracing for up to 2 seconds (kill switch at 1,000 events), aggregates call counts, and returns. The entire lifecycle runs on the target with zero Giulia dependencies.&lt;/p&gt;
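&lt;p&gt;A minimal sketch of the pattern, illustrative rather than Build 135's actual payload: it omits the event kill switch, and the window length and traced module are taken from the sessions described here.&lt;/p&gt;

```elixir
# Assumed target node from the article; :rpc.call works from any connected node.
target_node = :"nexus@192.168.10.174"

# The payload is a plain string: it is compiled on the target by Code.eval_string,
# so it may only use modules that already exist there (stdlib + the app's own).
trace_code = """
collector =
  spawn(fn ->
    loop = fn loop, counts ->
      receive do
        {:trace, _pid, :call, {m, f, args}} ->
          loop.(loop, Map.update(counts, {m, f, length(args)}, 1, &(&1 + 1)))

        {:report, caller} ->
          send(caller, {:counts, counts})
      end
    end

    loop.(loop, %{})
  end)

# Count calls into one module for a 2-second window
# (raises :badarg if another tracer is already installed).
:erlang.trace(:all, true, [:call, {:tracer, collector}])
:erlang.trace_pattern({Nexus.Registry.TableRegistry, :_, :_}, true, [])
Process.sleep(2_000)
:erlang.trace(:all, false, [:call])

send(collector, {:report, self()})

receive do
  {:counts, counts} -> counts
after
  1_000 -> %{}
end
"""

case :rpc.call(target_node, Code, :eval_string, [trace_code]) do
  {:badrpc, reason} -> {:error, reason}
  {counts, _bindings} -> counts
end
```

Everything inside the string runs on the target node; the caller only sees the returned map of `{module, function, arity} => call count`.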

&lt;h3&gt;
  
  
  Session 6: Single Module (73 seconds, 11 snapshots)
&lt;/h3&gt;

&lt;p&gt;Tracing &lt;code&gt;Nexus.Registry.TableRegistry&lt;/code&gt; only:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Function&lt;/th&gt;
&lt;th&gt;Calls&lt;/th&gt;
&lt;th&gt;Samples&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;TableRegistry.get/1&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;9,238&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The single hottest function on Nexus. Every CRUD operation goes through &lt;code&gt;get/1&lt;/code&gt; to look up the schema definition before executing. But the call count alone doesn't tell you &lt;em&gt;how&lt;/em&gt; it's implemented.&lt;/p&gt;

&lt;h3&gt;
  
  
  Session 7: Dual Module Trace (85 seconds, 11 snapshots)
&lt;/h3&gt;

&lt;p&gt;Tracing both &lt;code&gt;Nexus.Registry.TableRegistry&lt;/code&gt; and &lt;code&gt;Nexus.Repo&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Nexus.Repo — 7,000 total calls:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Function&lt;/th&gt;
&lt;th&gt;Calls&lt;/th&gt;
&lt;th&gt;Share&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;prepare_opts/2&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;1,266&lt;/td&gt;
&lt;td&gt;18.1%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;default_options/1&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;1,263&lt;/td&gt;
&lt;td&gt;18.0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;get/1&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;1,263&lt;/td&gt;
&lt;td&gt;18.0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;prepare_query/3&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;956&lt;/td&gt;
&lt;td&gt;13.7%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;all/1&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;593&lt;/td&gt;
&lt;td&gt;8.5%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;one/1&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;353&lt;/td&gt;
&lt;td&gt;5.0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;insert_all/3&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;307&lt;/td&gt;
&lt;td&gt;4.4%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;put_dynamic_repo/1&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;0.2%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Nexus.Registry.TableRegistry — 2,004 total calls:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Function&lt;/th&gt;
&lt;th&gt;Calls&lt;/th&gt;
&lt;th&gt;Share&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;get/1&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;354&lt;/td&gt;
&lt;td&gt;17.7%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;prepare_opts/2&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;354&lt;/td&gt;
&lt;td&gt;17.7%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;default_options/1&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;354&lt;/td&gt;
&lt;td&gt;17.7%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;prepare_query/3&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;292&lt;/td&gt;
&lt;td&gt;14.6%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;all/1&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;186&lt;/td&gt;
&lt;td&gt;9.3%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;one/1&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;106&lt;/td&gt;
&lt;td&gt;5.3%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;insert_all/3&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;62&lt;/td&gt;
&lt;td&gt;3.1%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;And there it is. &lt;code&gt;TableRegistry&lt;/code&gt; has &lt;code&gt;prepare_opts/2&lt;/code&gt;, &lt;code&gt;default_options/1&lt;/code&gt;, &lt;code&gt;prepare_query/3&lt;/code&gt; — the telltale signature of an Ecto Repo. &lt;strong&gt;The schema registry is backed by a database table, not ETS.&lt;/strong&gt; Every schema lookup is a round-trip to PostgreSQL.&lt;/p&gt;




&lt;h2&gt;
  
  
  What the Numbers Say
&lt;/h2&gt;

&lt;h3&gt;
  
  
  TableRegistry Should Be ETS (The Headline Finding)
&lt;/h3&gt;

&lt;p&gt;354 &lt;code&gt;get/1&lt;/code&gt; calls to PostgreSQL per observation window, for table metadata that almost never changes. Boot-time load, invalidate on DDL events, done. This eliminates ~2,000 function calls per window (the full Ecto pipeline runs on every registry query) and removes an entire class of unnecessary database round-trips from the hot path.&lt;/p&gt;
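&lt;p&gt;The shape of the fix, as a hedged sketch: the module, function names, and loader are invented stand-ins, not Nexus code.&lt;/p&gt;

```elixir
defmodule SchemaCache do
  @moduledoc "Sketch: ETS-backed schema registry, loaded at boot, reloaded on DDL events."
  use GenServer

  @table :schema_cache

  def start_link(loader), do: GenServer.start_link(__MODULE__, loader, name: __MODULE__)

  # Hot path: a lock-free ETS lookup in the calling process, no DB round-trip.
  def get(table_name) do
    case :ets.lookup(@table, table_name) do
      [{^table_name, schema}] -> {:ok, schema}
      [] -> {:error, :not_found}
    end
  end

  # Hook this to DDL events to refresh the cache.
  def invalidate, do: GenServer.call(__MODULE__, :reload)

  @impl true
  def init(loader) do
    :ets.new(@table, [:named_table, :set, :protected, read_concurrency: true])
    load(loader)
    {:ok, loader}
  end

  @impl true
  def handle_call(:reload, _from, loader) do
    :ets.delete_all_objects(@table)
    load(loader)
    {:reply, :ok, loader}
  end

  # `loader` stands in for the boot-time query that reads every schema row once.
  defp load(loader) do
    for {name, schema} <- loader.(), do: :ets.insert(@table, {name, schema})
  end
end
```

Reads never touch the GenServer; only reloads serialize through it.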

&lt;h3&gt;
  
  
  The Workload Is 87% Reads
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Read operations:  2,209 (87%)    Read/Write ratio: 6.9:1
Write operations:    320 (13%)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;Repo.get/1&lt;/code&gt; at 1,263 calls is the hottest single operation in this window. Combined with &lt;code&gt;all/1&lt;/code&gt; (593) and &lt;code&gt;one/1&lt;/code&gt; (353), reads dominate. A read-through ETS cache with a TTL on frequently accessed records would cut database pressure significantly.&lt;/p&gt;
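&lt;p&gt;A minimal sketch of that read-through idea; the table name and default TTL are illustrative.&lt;/p&gt;

```elixir
defmodule ReadCache do
  @table :read_cache

  def init, do: :ets.new(@table, [:named_table, :set, :public, read_concurrency: true])

  # Serve from ETS while the entry is fresh; otherwise fall through to the
  # loader (the real DB query) and cache the result with a timestamp.
  def fetch(key, loader, ttl_ms \\ 5_000) do
    now = System.monotonic_time(:millisecond)

    case :ets.lookup(@table, key) do
      [{^key, value, cached_at}] when now - cached_at < ttl_ms ->
        value

      _ ->
        value = loader.()
        :ets.insert(@table, {key, value, now})
        value
    end
  end
end
```

Stale entries are simply overwritten on the next miss, so no sweeper process is required for correctness.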

&lt;h3&gt;
  
  
  Ecto Overhead Is Structural (Don't Touch It)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Ecto pipeline calls:  3,485 (prepare_opts + default_options + prepare_query)
Actual DB operations:  2,530
Overhead ratio:         1.38x
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;1.38 pipeline calls per operation. This is Ecto doing its job — query preparation, option merging, type validation. You're not going to bypass it, and the savings would be marginal. The real wins come from not hitting the database at all.&lt;/p&gt;

&lt;h3&gt;
  
  
  Dynamic Repo Switching Works Fine
&lt;/h3&gt;

&lt;p&gt;12 &lt;code&gt;put_dynamic_repo/1&lt;/code&gt; calls across 2,530 operations (1 switch per 211 ops). Connection switching is not a concern.&lt;/p&gt;

&lt;h3&gt;
  
  
  The BEAM Has Headroom
&lt;/h3&gt;

&lt;p&gt;Run queue of 1, stable 90 MB memory, ~30 DB ops/sec during trace windows. The VM isn't stressed. If there's a throughput bottleneck, it's on the PostgreSQL side — &lt;code&gt;pg_stat_statements&lt;/code&gt; would confirm.&lt;/p&gt;
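&lt;p&gt;A typical first look on that side, assuming the &lt;code&gt;pg_stat_statements&lt;/code&gt; extension is enabled; the database name is a placeholder, and the column names are those of PostgreSQL 13+.&lt;/p&gt;

```shell
# Top 10 statements by cumulative execution time on the Nexus database.
psql -d nexus -c "
  SELECT calls,
         round(total_exec_time::numeric, 1) AS total_ms,
         round(mean_exec_time::numeric, 2)  AS mean_ms,
         left(query, 60)                    AS query
  FROM pg_stat_statements
  ORDER BY total_exec_time DESC
  LIMIT 10;"
```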

&lt;h3&gt;
  
  
  Priority
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;th&gt;Impact&lt;/th&gt;
&lt;th&gt;Effort&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Cache &lt;code&gt;TableRegistry&lt;/code&gt; in ETS&lt;/td&gt;
&lt;td&gt;Eliminates ~2,000 DB calls/window&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Read cache for hot &lt;code&gt;get/1&lt;/code&gt; paths&lt;/td&gt;
&lt;td&gt;Reduces 87% read load&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Profile PostgreSQL side&lt;/td&gt;
&lt;td&gt;Confirms where latency lives&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  What Each Layer Revealed
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Finding&lt;/th&gt;
&lt;th&gt;Process-Level&lt;/th&gt;
&lt;th&gt;Function-Level&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Synchronous logger at 39% CPU&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Logger worse at warning level (Sessions 3→4→5)&lt;/td&gt;
&lt;td&gt;Yes (comparative)&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;TableRegistry.get/1&lt;/code&gt; is the #1 function&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;9,238 calls&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TableRegistry is database-backed&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Ecto signatures&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;87% read-heavy workload&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Ratio analysis&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1.38x Ecto overhead (structural, ignore it)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dynamic repo switching is fine&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;12 switches&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BEAM has headroom, DB is the constraint&lt;/td&gt;
&lt;td&gt;Partial&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Throughput data&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Process-level found the logger bug — a real operational fix. Function-level found the architectural issue — a schema registry that should never have been database-backed. One is a config change. The other changes how you design the system.&lt;/p&gt;

&lt;p&gt;Both came from a node we don't own, running code we've never read, with zero instrumentation on the target.&lt;/p&gt;




&lt;h2&gt;
  
  
  How This Works
&lt;/h2&gt;

&lt;p&gt;No source access. No agents installed on the target. No probes compiled in. Giulia's Monitor connects to any BEAM node via Distributed Erlang, collects process snapshots and function-level traces by sending self-contained code strings over RPC (&lt;code&gt;Code.eval_string&lt;/code&gt;), and pushes the results to the Worker for AST correlation and fused profiling. The entire observation lifecycle runs on the target using only its own stdlib — Giulia never needs to be loaded there.&lt;/p&gt;

&lt;p&gt;Connect, observe, trace, recommend.&lt;/p&gt;

</description>
      <category>elixir</category>
      <category>distributedsystems</category>
      <category>postgres</category>
      <category>testing</category>
    </item>
  </channel>
</rss>
