<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: praveen sinha</title>
    <description>The latest articles on Forem by praveen sinha (@pravdexter).</description>
    <link>https://forem.com/pravdexter</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3919348%2F271aef7c-4fca-4f17-8926-38f89e610816.jpg</url>
      <title>Forem: praveen sinha</title>
      <link>https://forem.com/pravdexter</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/pravdexter"/>
    <language>en</language>
    <item>
      <title>GemmaOps Edge: From 373 Alarms to 1 Root Cause Using Local AI (Gemma 4)</title>
      <dc:creator>praveen sinha</dc:creator>
      <pubDate>Fri, 22 May 2026 17:20:14 +0000</pubDate>
      <link>https://forem.com/pravdexter/gemmaops-edge-from-373-alarms-to-1-root-cause-using-local-ai-gemma-4-1cb9</link>
      <guid>https://forem.com/pravdexter/gemmaops-edge-from-373-alarms-to-1-root-cause-using-local-ai-gemma-4-1cb9</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/google-gemma-2026-05-06"&gt;Gemma 4 Challenge: Build with Gemma 4&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;🚨 From 373 alarms to 1 root cause in seconds&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;A production-grade AI reasoning agent that turns a wall of network alarms into clear root-cause analysis — running entirely on your own hardware.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Problem
&lt;/h3&gt;

&lt;p&gt;It is 3 AM. A NOC engineer receives an alert:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"North region customers reporting intermittent connectivity drops. Possible fiber cut or BGP flap."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The system shows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;373 alarms
&lt;/li&gt;
&lt;li&gt;45 active
&lt;/li&gt;
&lt;li&gt;6 CRITICAL
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The challenge:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Identify root cause
&lt;/li&gt;
&lt;li&gt;Determine blast radius
&lt;/li&gt;
&lt;li&gt;Estimate impact and resolution
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This typically takes &lt;strong&gt;20–120 minutes&lt;/strong&gt; depending on expertise.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Solution
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;GemmaOps Edge&lt;/strong&gt; is a &lt;strong&gt;fully local AI reasoning agent&lt;/strong&gt; that enables operators to query network state in natural language and receive precise, actionable insights.&lt;/p&gt;

&lt;p&gt;While GemmaOps Edge is demonstrated using telecom NOC scenarios, the same architecture applies to any high-volume event-driven system — including cloud observability, microservices monitoring, and enterprise infrastructure platforms.&lt;/p&gt;

&lt;p&gt;🚨 This is not alert summarization — it is reasoning-driven root cause analysis.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Capabilities
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Topology-aware Root Cause Analysis&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Multi-condition Correlation (alarms + topology + history + traffic)&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Service Impact Propagation&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Historical Incident Matching with MTTR estimation&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Natural Language Query Interface&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Example Interaction
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Operator:&lt;/strong&gt; Why is the North region experiencing outages?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Agent:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;BGP SESSION DOWN on CR-NOR-01 (ALM-00196)
&lt;/li&gt;
&lt;li&gt;CE-NOR-02 (ALM-00199) — 1,252+ prefixes withdrawn
&lt;/li&gt;
&lt;li&gt;SERVICE_OUTAGE affecting 2,560 customers
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Historical match:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
INC-2026-017 (BGP failure, MTTR 53 min)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Recommended actions:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Check BGP config changes
&lt;/li&gt;
&lt;li&gt;Rollback recent changes
&lt;/li&gt;
&lt;li&gt;Initiate incident bridge
&lt;/li&gt;
&lt;/ol&gt;


&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌──────────────────────────────────────────────────────────────────────┐
│                            GemmaOps Edge                             │
│              AI-Powered NOC Assistant · Edge Deployment              │
│                                                                      │
│   ┌───────────────────┐                 ┌─────────────────────┐      │
│   │   NOC Dashboard   │◄──────REST─────►│   FastAPI Backend   │      │
│   │    React · TW     │                 │                     │      │
│   └───────────────────┘                 └──────────┬──────────┘      │
│                                                    │                 │
│                                         ┌──────────▼──────────┐      │
│                                         │     Agent Loop      │      │
│                                         │   ReAct · 128K ctx  │      │
│                                         └──────────┬──────────┘      │
│                                                    │                 │
│                                 ┌──────────────────┼───────────────┐ │
│                                 │                  │               │ │
│                       ┌─────────▼────────┐   ┌─────▼──────────┐    │ │
│                       │ Context Builder  │   │Reasoning Engine│    │ │
│                       └──────────────────┘   └────────┬───────┘    │ │
│                                 │                     │            │ │
│                       ┌─────────▼────────┐            │            │ │
│                       │  Tool Registry   │◄───────────┘            │ │
│                       └──────────────────┘                         │ │
│                                 └──────────────────────────────────┘ │
│                                                                      │
│   ┌──────────────────────────────┐     ┌─────────────────────────┐   │
│   │          Data Layer          │     │      Memory Layer       │   │
│   │                              │     │                         │   │
│   │   • NetworkX graph           │     │ • Redis     · short-term│   │
│   │   • FAISS vector index       │     │ • ChromaDB  · long-term │   │
│   │   • Live alarm store         │     │                         │   │
│   └──────────────────────────────┘     └─────────────────────────┘   │
│                                                                      │
│   ┌──────────────────────────────────────────────────────────────┐   │
│   │            Ollama · gemma4:e4b · localhost:11434             │   │
│   └──────────────────────────────────────────────────────────────┘   │
└──────────────────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;✔ Fully local deployment&lt;br&gt;&lt;br&gt;
✔ No cloud/API dependency&lt;br&gt;&lt;br&gt;
✔ Runs on commodity hardware  &lt;/p&gt;


&lt;h2&gt;
  
  
  How It Works
&lt;/h2&gt;
&lt;h3&gt;
  
  
  ReAct Agent (Reasoning + Acting)
&lt;/h3&gt;

&lt;p&gt;The agent dynamically:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Reads summarized network state
&lt;/li&gt;
&lt;li&gt;Calls tools based on need
&lt;/li&gt;
&lt;li&gt;Correlates multiple data sources
&lt;/li&gt;
&lt;li&gt;Produces precise RCA output
&lt;/li&gt;
&lt;/ol&gt;


&lt;h3&gt;
  
  
  NOC Tools
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;alarm_search&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Fetch active alarms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;topology_lookup&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Get network relationships&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;path_finder&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Analyze routes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;incident_search&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Retrieve historical incidents&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;


&lt;h3&gt;
  
  
  Context Engineering (Critical Innovation)
&lt;/h3&gt;

&lt;p&gt;Priority-based prompt construction:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;KEY FACTS (highest impact)
&lt;/li&gt;
&lt;li&gt;Query intent
&lt;/li&gt;
&lt;li&gt;Active alarms
&lt;/li&gt;
&lt;li&gt;Topology graph
&lt;/li&gt;
&lt;li&gt;Historical incidents
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;➡ Improved accuracy from ~40% to ~90%&lt;/p&gt;


&lt;h2&gt;
  
  
  The 128K Advantage
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Two Operating Modes
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Mode&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ReAct (6K)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fast, tool-driven RCA analysis&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Full Context (128K)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Whole-network reasoning in one pass&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;


&lt;h3&gt;
  
  
  Why It Matters
&lt;/h3&gt;

&lt;p&gt;Questions like:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Which nodes appear in both CRITICAL alarms AND past P1 incidents?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;❌ Cannot be solved by RAG or smaller-context models&lt;br&gt;&lt;br&gt;
✅ Solved using &lt;strong&gt;full-context reasoning&lt;/strong&gt;&lt;/p&gt;


&lt;h3&gt;
  
  
  Benchmark Results
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Context&lt;/th&gt;
&lt;th&gt;Performance&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Gemma 4B&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;128K&lt;/td&gt;
&lt;td&gt;✅ 5/5 (Best)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mistral 7B&lt;/td&gt;
&lt;td&gt;32K&lt;/td&gt;
&lt;td&gt;⚠️ 2/5 (Partial)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemma 2B&lt;/td&gt;
&lt;td&gt;8K&lt;/td&gt;
&lt;td&gt;❌ 1/5 (Limited)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;➡ The limitation is &lt;strong&gt;context window&lt;/strong&gt;, not model size&lt;/p&gt;


&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/koEjgTlgyhE"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;




&lt;h2&gt;
  
  
  Code
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/praveen-sinha-ai/gemmaops-edge" rel="noopener noreferrer"&gt;https://github.com/praveen-sinha-ai/gemmaops-edge&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  How I Used Gemma 4
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Model Selected
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;gemma4:e4b (4B)&lt;/code&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Why This Model
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Edge Deployment Requirement
&lt;/li&gt;
&lt;li&gt;Runs locally (no GPU required)
&lt;/li&gt;
&lt;li&gt;&amp;lt; 3GB footprint
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;1–4s response time  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Reasoning Capability  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Handles multi-condition correlation:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;alarms
&lt;/li&gt;
&lt;li&gt;topology
&lt;/li&gt;
&lt;li&gt;incidents
&lt;/li&gt;
&lt;li&gt;traffic
&lt;/li&gt;
&lt;li&gt;config
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Accuracy vs Efficiency Balance  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;E2B → insufficient reasoning  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;31B → impractical for edge deployment  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;E4B → optimal trade-off  &lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;h3&gt;
  
  
  Two Usage Modes
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;ReAct Agent Mode (6K)
&lt;/li&gt;
&lt;li&gt;Multi-step reasoning
&lt;/li&gt;
&lt;li&gt;Tool-based retrieval
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Fast responses  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Full Context Mode (128K)  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Entire dataset in prompt (~43K tokens)  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;No retrieval needed  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Enables deep correlation queries  &lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;h3&gt;
  
  
  Key Insight
&lt;/h3&gt;

&lt;p&gt;The biggest differentiator was not model size —&lt;br&gt;&lt;br&gt;
it was how much data the model could see at once.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Makes This Different
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Not a basic RAG system or generic LLM wrapper
&lt;/li&gt;
&lt;li&gt;Performs &lt;strong&gt;multi-step reasoning with tool execution (ReAct)&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Understands &lt;strong&gt;network topology as a graph&lt;/strong&gt;, not just text
&lt;/li&gt;
&lt;li&gt;Combines alarms, topology, and incident history in one reasoning flow
&lt;/li&gt;
&lt;li&gt;Supports &lt;strong&gt;full-network reasoning using 128K context&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Runs fully &lt;strong&gt;local — no cloud, no data exposure&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Produces &lt;strong&gt;specific, verifiable outputs (IDs, nodes, incidents)&lt;/strong&gt; — not vague summaries
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Graph Neural Networks (GNN-based RCA)
&lt;/li&gt;
&lt;li&gt;Predictive failure detection
&lt;/li&gt;
&lt;li&gt;Automated remediation workflows
&lt;/li&gt;
&lt;li&gt;Larger Gemma models (26B, 31B)
&lt;/li&gt;
&lt;li&gt;Domain fine-tuning (3GPP, TM Forum)
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Closing
&lt;/h2&gt;

&lt;p&gt;The biggest insight from building GemmaOps Edge:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The limitation is not model intelligence — it is how much of the system the model can see at once.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;By combining:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Structured context engineering
&lt;/li&gt;
&lt;li&gt;Topology-aware reasoning
&lt;/li&gt;
&lt;li&gt;Large context windows (128K)
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;…it becomes possible to move from &lt;strong&gt;alert noise → precise root cause&lt;/strong&gt; in seconds.&lt;/p&gt;

&lt;p&gt;In a real NOC, that difference is not theoretical:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;2 hours MTTR → 20 minutes
&lt;/li&gt;
&lt;li&gt;Fewer escalations
&lt;/li&gt;
&lt;li&gt;Faster recovery
&lt;/li&gt;
&lt;li&gt;Better customer experience
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Local AI for enterprise operations is no longer a future concept.&lt;br&gt;&lt;br&gt;
With Gemma 4, it is practical today.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tech Stack:&lt;/strong&gt; Python, FastAPI, NetworkX, FAISS, Ollama, Gemma 4  &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tags:&lt;/strong&gt; gemma ai telecom llm fastapi  &lt;/p&gt;




&lt;h2&gt;
  
  
  Feedback &amp;amp; Discussion
&lt;/h2&gt;

&lt;p&gt;I built GemmaOps Edge to solve a very real problem I’ve seen repeatedly in telecom NOCs — too many alarms, too little clarity.&lt;/p&gt;

&lt;p&gt;If you're working on similar problems (telecom, observability, AI agents), I’d genuinely like to hear your thoughts.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What would you improve in this approach?
&lt;/li&gt;
&lt;li&gt;Would you trust this in a real NOC?
&lt;/li&gt;
&lt;li&gt;Any ideas for scaling this further?
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Feel free to drop your questions or suggestions in the comments.&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>gemmachallenge</category>
      <category>gemma</category>
    </item>
  </channel>
</rss>
