<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Abhishek</title>
    <description>The latest articles on Forem by Abhishek (@abhishek_mishra_01).</description>
    <link>https://forem.com/abhishek_mishra_01</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3823436%2Fb4461944-468c-45de-8e3b-25c45caf3b35.jpeg</url>
      <title>Forem: Abhishek</title>
      <link>https://forem.com/abhishek_mishra_01</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/abhishek_mishra_01"/>
    <language>en</language>
    <item>
      <title>Most AI Research Pipelines Produce Noise Not Decisions</title>
      <dc:creator>Abhishek</dc:creator>
      <pubDate>Wed, 22 Apr 2026 12:47:04 +0000</pubDate>
      <link>https://forem.com/abhishek_mishra_01/most-ai-research-pipelines-produce-noise-not-decisions-1h92</link>
      <guid>https://forem.com/abhishek_mishra_01/most-ai-research-pipelines-produce-noise-not-decisions-1h92</guid>
      <description>&lt;p&gt;I'm going to say something that'll bother some people:&lt;/p&gt;

&lt;p&gt;Most teams think they're doing AI-powered research. They're not. They're just accelerating search.&lt;/p&gt;

&lt;p&gt;Real leverage, the kind that compounds, comes from building a &lt;strong&gt;repeatable research system&lt;/strong&gt; that converts raw information into decisions, specs, and execution paths.&lt;/p&gt;

&lt;p&gt;There's a difference between using AI and operating it.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Loop I Keep Seeing
&lt;/h2&gt;

&lt;p&gt;Here's what most engineers do:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Search on Perplexity → summarize in ChatGPT → expand in Claude&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It &lt;em&gt;feels&lt;/em&gt; productive. But nothing compounds, because the output is still unstructured insight, not operational clarity. Every session starts from zero. Nothing persists.&lt;/p&gt;

&lt;p&gt;Let me show you what it looks like when it actually works.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Research Is a System, Not a Session
&lt;/h2&gt;

&lt;p&gt;Most people treat research like a one-time activity. You open a tab, ask an AI, read a summary, and move on. Nothing persists.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Operators treat it as a pipeline:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Signal → Pattern → Insight → Decision → Artifact
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If your research doesn't produce &lt;strong&gt;artifacts&lt;/strong&gt; — docs, specs, structured datasets — it resets every time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real-world example:&lt;/strong&gt;&lt;br&gt;
Instead of summarizing "AI agents in DevOps," build a living problem map:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pain points (from GitHub issues, forums)&lt;/li&gt;
&lt;li&gt;Frequency of occurrence&lt;/li&gt;
&lt;li&gt;Cost impact per incident&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Technical note:&lt;/strong&gt;&lt;br&gt;
Store outputs in structured formats — JSON, Notion DB, vector store. That enables retrieval and iteration, not rework.&lt;/p&gt;
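&lt;p&gt;As a minimal sketch of that idea (the file name and field names below are illustrative, not a fixed schema), a JSON Lines file already gives you a persistent, queryable research base:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import json

# Illustrative schema for one research signal; field names are assumptions.
signal = {
    "source": "github-issues",
    "problem": "Flaky CI pipelines on self-hosted runners",
    "frequency": 14,                # occurrences seen this month
    "cost_per_incident_usd": 250,
}

# An append-only JSON Lines file acts as the persistent research base.
with open("signals.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(signal) + "\n")

# Later sessions reload and iterate instead of starting from zero.
with open("signals.jsonl", "r", encoding="utf-8") as f:
    signals = [json.loads(line) for line in f]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The same records can later be pushed into a Notion DB or vector store; the point is that every session appends to the map instead of replacing it.&lt;/p&gt;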

&lt;p&gt;Teams that systematize research reduce decision cycles from weeks to hours.&lt;/p&gt;


&lt;h2&gt;
  
  
  2. Stop Mixing Signal Gathering With Thinking
&lt;/h2&gt;

&lt;p&gt;You're running two different cognitive tasks in the same session:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Data collection&lt;/strong&gt; (breadth)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reasoning&lt;/strong&gt; (depth)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's inefficient. Here's the correct split:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Stage&lt;/th&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Task&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Stage 1: Signal&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Perplexity AI&lt;/td&gt;
&lt;td&gt;Pull trends, extract discussions, surface patterns&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Stage 2: Thinking&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ChatGPT&lt;/td&gt;
&lt;td&gt;Cluster problems, rank by impact, find root causes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Stage 3: Structure&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Claude&lt;/td&gt;
&lt;td&gt;Convert into structured docs, define systems and workflows&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;



&lt;p&gt;Different models are optimized for different tasks — retrieval vs reasoning vs long-context structuring. Multi-model workflows outperform single-model dependency. That's not an opinion; it's just how the tools are built.&lt;/p&gt;


&lt;h2&gt;
  
  
  3. The Output of Research Is a Decision, Not a Summary
&lt;/h2&gt;

&lt;p&gt;Summaries feel useful. They're not.&lt;/p&gt;

&lt;p&gt;If your research ends with &lt;em&gt;"Here are 10 insights…"&lt;/em&gt; you've stopped too early.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It should end with:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What are we building?&lt;/li&gt;
&lt;li&gt;For whom, specifically?&lt;/li&gt;
&lt;li&gt;Why now?&lt;/li&gt;
&lt;li&gt;What metric improves, and by how much?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Bad&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;output:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="s2"&gt;"Developers struggle with cloud setup"&lt;/span&gt;&lt;span class="w"&gt;

 &lt;/span&gt;&lt;span class="err"&gt;Good&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;output:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="s2"&gt;"Reduce time-to-first-deploy from 2 hours → 10 minutes
 using an AI deployment agent for indie dev teams on AWS"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Force every AI output into a decision template: &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Problem → User → Metric → Constraint&lt;/strong&gt;. Clarity at this stage determines whether you build signal — or noise.&lt;/p&gt;
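&lt;p&gt;One way to enforce that template mechanically — a sketch only, the class and field names are my own encoding of the four slots, not a standard — is to refuse any research output that leaves a slot empty:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from dataclasses import dataclass

# The decision template: Problem → User → Metric → Constraint.
# Class and field names are an illustrative encoding, not a standard.
@dataclass
class Decision:
    problem: str
    user: str
    metric: str       # what improves, and by how much
    constraint: str

    def is_complete(self):
        # A research session only "counts" if every slot is filled in.
        fields = [self.problem, self.user, self.metric, self.constraint]
        return all(f.strip() for f in fields)

d = Decision(
    problem="time-to-first-deploy is 2 hours",
    user="indie dev teams on AWS",
    metric="reduce to 10 minutes",
    constraint="no custom infra, AWS-native only",
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;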




&lt;h2&gt;
  
  
  4. Prompting Is Not the Lever — Interfaces Are
&lt;/h2&gt;

&lt;p&gt;I keep hearing "next-level prompts" as if better wording unlocks some hidden power. It doesn't.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prompts are not hacks. They are interfaces.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Each research step should have a defined input schema, expected output schema, and hard constraints.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt; &lt;span class="na"&gt;Vague&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Analyze&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;market&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;trends"&lt;/span&gt;

&lt;span class="na"&gt;Structured&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;span class="na"&gt;ROLE&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;        &lt;span class="s"&gt;Market Analyst&lt;/span&gt;
&lt;span class="na"&gt;INPUT&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;       &lt;span class="s"&gt;Raw signals (links, forum notes)&lt;/span&gt;
&lt;span class="na"&gt;OUTPUT&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;      &lt;span class="s"&gt;Ranked problem list by cost + frequency&lt;/span&gt;
&lt;span class="na"&gt;CONSTRAINT&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  &lt;span class="s"&gt;B2B infra problems only, ignore consumer noise&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Structured prompting reduces variance and increases reproducibility. Teams with defined AI interfaces can scale research across people and systems. Teams without them keep running one-off sessions.&lt;/p&gt;
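&lt;p&gt;Treating a prompt as an interface also means it can be rendered and validated in code. A minimal sketch, assuming the ROLE/INPUT/OUTPUT/CONSTRAINT layout used earlier (the function name is hypothetical):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Prompt-as-interface: the template keys mirror the
# ROLE / INPUT / OUTPUT / CONSTRAINT layout; the function is illustrative.
def render_prompt(spec):
    required = ["ROLE", "INPUT", "OUTPUT", "CONSTRAINT"]
    missing = [k for k in required if k not in spec]
    if missing:
        # An incomplete interface fails loudly instead of running a vague session.
        raise ValueError(f"prompt interface incomplete: {missing}")
    return "\n".join(f"{k}: {spec[k]}" for k in required)

prompt = render_prompt({
    "ROLE": "Market Analyst",
    "INPUT": "Raw signals (links, forum notes)",
    "OUTPUT": "Ranked problem list by cost + frequency",
    "CONSTRAINT": "B2B infra problems only, ignore consumer noise",
})
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Anyone on the team can now run the same research step and get comparable output, which is what makes it reproducible.&lt;/p&gt;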




&lt;h2&gt;
  
  
  5. Compounding Comes From Memory + Iteration
&lt;/h2&gt;

&lt;p&gt;The biggest mistake — even from experienced engineers — is starting from scratch every time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Your system should:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Store past research outputs&lt;/li&gt;
&lt;li&gt;Reuse insights across sessions&lt;/li&gt;
&lt;li&gt;Refine over time, not restart&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;A compounding research loop looks like this:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Day 01 → Collect 50 raw problem signals
Day 03 → Cluster into 10 categories
Day 07 → Identify top 3 high-signal opportunities
Day 14 → Build system architecture from validated insight
Day 30 → Feed usage data back in → refine the map
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use embeddings + retrieval to re-inject prior knowledge into future prompts. A well-organized Notion DB with tagged outputs gets you 80% of the way there without building anything complex.&lt;/p&gt;
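&lt;p&gt;The retrieval step can be sketched in a few lines. The bag-of-words "embedding" below is a toy stand-in for a real embedding model, used only to show the re-injection loop; the note texts are invented:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import math
from collections import Counter

def embed(text):
    # Toy stand-in for an embedding model: bag-of-words counts.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Prior research outputs, indexed once and reused across sessions.
notes = [
    "AI deployment agents reduce time to first deploy on AWS",
    "Teams struggle with flaky CI pipelines and retries",
]
index = [(note, embed(note)) for note in notes]

def retrieve(query):
    q = embed(query)
    # Re-inject the most similar prior note into the next prompt.
    return max(index, key=lambda item: cosine(q, item[1]))[0]

context = retrieve("slow deploys on AWS")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Swap the toy embedding for a real model and the list for a vector store, and the loop is the same: query, retrieve, prepend to the prompt.&lt;/p&gt;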




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fddzy4vneep5i0i4cfhio.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fddzy4vneep5i0i4cfhio.png" alt="AI Research arhiecture "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Multi-Agent Research System Prompt
&lt;/h2&gt;

&lt;p&gt;Here's the actual system prompt I use for structured AI research. Drop it in, replace the domain, run it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ROLE: Senior AI Research + Systems Design Agent

OBJECTIVE:
Identify high-value, real-world problems from the market
and convert them into production-grade system opportunities.

You are NOT a chatbot.
You operate as a structured, multi-agent system internally.

-----


SYSTEM EXECUTION MODEL — run these sub-agents in sequence:

1. MARKET SIGNAL AGENT
   → Collect real friction from forums, GitHub issues, reviews

2. PROBLEM EXTRACTION AGENT
   → Convert signals into structured problem statements

3. ROOT CAUSE ANALYSIS AGENT
   → Identify causes, not symptoms

4. OPPORTUNITY PRIORITIZATION AGENT
   → Rank by frequency × severity × AI suitability

5. SYSTEM DESIGN AGENT
   → Design architecture: input → AI → deterministic → output

6. VALIDATION AGENT
   → Challenge assumptions, define MVP, list unknowns

-----


STAGE 1 — MARKET SIGNAL EXTRACTION
Sources: GitHub issues · StackOverflow · G2/Capterra · Reddit · Engineering blogs
Output: 10–20 recurring problem signals with frequency + severity

STAGE 2 — PROBLEM DEFINITION
For each: Who is the user? What is broken? Where in workflow? Measurable impact?
Output: Top 5 clearly defined, high-impact problems

STAGE 3 — ROOT CAUSE ANALYSIS
Break into: Technical limitations · Workflow gaps · Tool fragmentation · Cognitive load
Output: Root cause map per problem

STAGE 4 — OPPORTUNITY PRIORITIZATION
Rank by: Frequency · Severity · Urgency · AI suitability
Output: Top 1–2 opportunities with strongest potential

STAGE 5 — SYSTEM DESIGN (CRITICAL)
Design production-grade architecture:
  Input Layer
  → Processing Layer (LLM vs deterministic split)
  → Orchestration Layer
  → Execution Layer (APIs/tools)
  → Feedback + learning loop

Define clearly:
  What AI handles vs what deterministic systems handle
Include: Failure modes + mitigation + evaluation metrics

STAGE 6 — VALIDATION
Challenge: Is this already solved? Why do current solutions fail?
Output: Risks · Unknowns · MVP scope

-----


CONSTRAINTS:
- No generic ideas
- No surface-level summaries
- No "AI will solve this" without system design
- Be specific, technical, and decision-oriented

DOMAIN INPUT: [INSERT YOUR DOMAIN]
Example: DevOps · FinTech · SaaS Onboarding · Healthcare AI
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  The Operator Stack
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1 Signal&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Perplexity AI&lt;/td&gt;
&lt;td&gt;Fast external discovery&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2 Pattern&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ChatGPT&lt;/td&gt;
&lt;td&gt;Clustering, ranking, reasoning&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3 Structure&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Claude&lt;/td&gt;
&lt;td&gt;Long-form docs, architecture, workflows&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;4 Memory&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Notion / Vector DB&lt;/td&gt;
&lt;td&gt;Persistent research base&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;5 Decision&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Output artifacts&lt;/td&gt;
&lt;td&gt;Problem statements, specs, architecture drafts&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  What Changes When You Do This Right
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Research becomes &lt;strong&gt;repeatable&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Insights become &lt;strong&gt;assets&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Decisions become &lt;strong&gt;faster&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Execution becomes &lt;strong&gt;inevitable&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Lines Worth Keeping
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;"Research that doesn't produce decisions is just organized reading."&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;"AI doesn't make you smarter. It makes your process visible."&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;"The goal isn't more information. It's less ambiguity."&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;"Prompts are temporary. Systems persist."&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;"If your research resets, you don't have a system."&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;You don't need better prompts.&lt;/p&gt;

&lt;p&gt;You need a system where AI moves you from &lt;strong&gt;signal → decision → execution&lt;/strong&gt; without restarting.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Signal → Pattern → Problem → System → Interface → Build → Evaluate → Iterate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Most people stop at &lt;code&gt;Signal → Summary&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;That's why they never ship.&lt;/p&gt;




&lt;p&gt;AI doesn’t remove the need for thinking.&lt;br&gt;
It removes the cost of iteration.&lt;/p&gt;

&lt;p&gt;If your system is weak, you just reach bad conclusions faster.&lt;/p&gt;

&lt;p&gt;If your system is strong, you compress weeks of research into hours.&lt;/p&gt;

&lt;p&gt;I’m currently building an ACP (AI Control Plane) around this exact model, separating signal ingestion, reasoning, memory, and execution into a single pipeline.&lt;/p&gt;

&lt;p&gt;The goal isn’t better prompts.&lt;br&gt;
It’s a system that doesn’t reset.&lt;br&gt;
I’ll break that down next.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Hashnode → &lt;a href="https://hashnode.com/@abhimishra-devops90" rel="noopener noreferrer"&gt;https://hashnode.com/@abhimishra-devops90&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;LinkedIn → &lt;a href="https://www.linkedin.com/in/abhishek-mishra-aws-devops/" rel="noopener noreferrer"&gt;https://www.linkedin.com/in/abhishek-mishra-aws-devops/&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>productivity</category>
      <category>automation</category>
    </item>
    <item>
      <title>Most Problems Don't Need AI (And That's Fine)</title>
      <dc:creator>Abhishek</dc:creator>
      <pubDate>Mon, 20 Apr 2026 13:18:44 +0000</pubDate>
      <link>https://forem.com/abhishek_mishra_01/most-problems-dont-need-ai-and-thats-fine-26if</link>
      <guid>https://forem.com/abhishek_mishra_01/most-problems-dont-need-ai-and-thats-fine-26if</guid>
      <description>&lt;h2&gt;
  
  
  The Question Nobody Asks
&lt;/h2&gt;

&lt;p&gt;Everyone's asking: How can I use AI for this?&lt;/p&gt;

&lt;p&gt;The better question is: Should I?&lt;/p&gt;

&lt;p&gt;Because here's what I learned the hard way:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI solves a very specific class of problems.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;And most of your problems aren't in that class.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Happened When I Built for SRE
&lt;/h2&gt;

&lt;p&gt;Last month, I started building an AI system for SRE.&lt;/p&gt;

&lt;p&gt;The idea wasn’t to generate text.&lt;br&gt;
It was to simulate real incident response.&lt;/p&gt;

&lt;p&gt;So I built an environment where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;systems break&lt;/li&gt;
&lt;li&gt;signals appear (logs, metrics)&lt;/li&gt;
&lt;li&gt;actions change the state&lt;/li&gt;
&lt;li&gt;wrong decisions are penalized&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Not "what would you do?"&lt;br&gt;
But:&lt;/p&gt;

&lt;p&gt;What happens when you actually act?&lt;/p&gt;



&lt;h2&gt;
  
  
  What I Realized Quickly
&lt;/h2&gt;

&lt;p&gt;AI looks good when it explains problems.&lt;/p&gt;

&lt;p&gt;It struggles when it has to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;decide under uncertainty&lt;/li&gt;
&lt;li&gt;take the correct sequence of actions&lt;/li&gt;
&lt;li&gt;handle multi-step failures&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In SRE, being almost right is still wrong.&lt;/p&gt;



&lt;h2&gt;
  
  
  Where Systems Break
&lt;/h2&gt;

&lt;p&gt;The hardest part wasn’t generation.&lt;/p&gt;

&lt;p&gt;It was:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;choosing the right action&lt;/li&gt;
&lt;li&gt;in the right order&lt;/li&gt;
&lt;li&gt;based on incomplete signals&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That’s where most AI systems fail.&lt;br&gt;
Not in demos.&lt;br&gt;
In decisions.&lt;/p&gt;



&lt;h2&gt;
  
  
  The Lesson
&lt;/h2&gt;

&lt;p&gt;SRE made one thing clear:&lt;/p&gt;

&lt;p&gt;AI is useful when it supports decisions.&lt;br&gt;
Not when it replaces them.&lt;/p&gt;



&lt;h2&gt;
  
  
  New Rule
&lt;/h2&gt;

&lt;p&gt;If your system requires:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;consistent, correct decisions under pressure&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Then AI alone is not enough.&lt;/p&gt;

&lt;p&gt;You need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;structure&lt;/li&gt;
&lt;li&gt;constraints&lt;/li&gt;
&lt;li&gt;validation&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  The Pattern I Started Seeing
&lt;/h2&gt;

&lt;p&gt;After that failure, I looked at every AI tool I'd built or evaluated.&lt;/p&gt;

&lt;p&gt;I found a pattern in what actually worked:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI works when the problem has high variance inputs and acceptable variance in outputs.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Let me break that down.&lt;/p&gt;


&lt;h2&gt;
  
  
  High Variance Inputs
&lt;/h2&gt;

&lt;p&gt;This means: the problem receives unpredictable, unstructured, or creative inputs.&lt;/p&gt;

&lt;p&gt;Examples that fit:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;User queries in natural language&lt;/li&gt;
&lt;li&gt;Bug reports written by non-technical users&lt;/li&gt;
&lt;li&gt;Code snippets in any language/framework&lt;/li&gt;
&lt;li&gt;API documentation across different vendors&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Examples that don't:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Structured database queries&lt;/li&gt;
&lt;li&gt;Configuration files with known schemas&lt;/li&gt;
&lt;li&gt;Metrics from monitoring tools&lt;/li&gt;
&lt;li&gt;Git commit hashes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If your input is already structured and predictable, you don't need AI. You need a parser.&lt;/p&gt;
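&lt;p&gt;To make the parser point concrete — the config format below is invented for illustration, not any real tool's — a known schema takes a few deterministic lines, no model call:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# A known schema needs a parser, not a model.
# This key = value format is illustrative only.
def parse_config_line(line):
    key, sep, value = line.partition("=")
    if not sep:
        # Deterministic failure: malformed input is an error, not a guess.
        raise ValueError(f"malformed line: {line!r}")
    return key.strip(), value.strip()

pairs = dict(parse_config_line(line) for line in [
    "region = us-east-1",
    "retries = 3",
])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;It is exhaustive, testable, and wrong in predictable ways — everything a model is not.&lt;/p&gt;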


&lt;h2&gt;
  
  
  Acceptable Variance in Outputs
&lt;/h2&gt;

&lt;p&gt;This means: the user can tolerate (and even expects) some variation in the response.&lt;/p&gt;

&lt;p&gt;Examples that fit:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Code suggestions (developer reviews before accepting)&lt;/li&gt;
&lt;li&gt;Draft responses to support tickets (human edits before sending)&lt;/li&gt;
&lt;li&gt;Initial test case generation (QA refines coverage)&lt;/li&gt;
&lt;li&gt;Summarizing long error logs (engineer investigates further)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Examples that don't:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Deploying to production&lt;/li&gt;
&lt;li&gt;Merging pull requests&lt;/li&gt;
&lt;li&gt;Granting permissions&lt;/li&gt;
&lt;li&gt;Processing payments&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;If the output must be deterministic and correct 100% of the time, AI is the wrong tool.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You need rules, not models.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Real Litmus Test
&lt;/h2&gt;

&lt;p&gt;Here's the framework I use now before writing any AI code:&lt;/p&gt;

&lt;p&gt;Prefer deterministic systems when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Inputs are structured&lt;/li&gt;
&lt;li&gt;Rules are stable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Use AI when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Rules explode combinatorially&lt;/li&gt;
&lt;li&gt;Context interpretation is required&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Best systems = hybrid (AI + constraints)&lt;/p&gt;
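&lt;p&gt;The litmus test reads naturally as code. This is a sketch of the framework above as a decision function — the booleans are the questions, the return strings are labels, nothing here is a real API:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# The litmus test as a function; argument names paraphrase the criteria above.
def choose_approach(inputs_structured, rules_stable, needs_context_interpretation):
    if inputs_structured and rules_stable:
        return "deterministic"
    if needs_context_interpretation:
        # The hybrid pattern: AI proposes, deterministic constraints validate.
        return "hybrid: AI proposes, constraints validate"
    return "deterministic first; add AI only if rules explode"

choice = choose_approach(
    inputs_structured=False,
    rules_stable=False,
    needs_context_interpretation=True,
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;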



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftfgltc1jl8hf43ukct26.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftfgltc1jl8hf43ukct26.png" alt="A whiteboard diagram showing production AI system design, including RAG architecture, agent workflows, system boundaries, and real-world failure modes—emphasizing that AI is a component within a larger, controlled system." width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Where AI Actually Belongs in Developer Tooling
&lt;/h2&gt;

&lt;p&gt;After building systems that worked and failed, here's what I've seen succeed:&lt;/p&gt;
&lt;h3&gt;
  
  
  Code Search &amp;amp; Navigation
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Why it works:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Developers search using imprecise natural language&lt;/li&gt;
&lt;li&gt;Codebase context is massive and varied&lt;/li&gt;
&lt;li&gt;"Close enough" results are useful&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
"Find where we handle rate limiting for the API"&lt;/p&gt;

&lt;p&gt;Traditional search fails because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We might call it "throttling" in some files&lt;/li&gt;
&lt;li&gt;Implementation is split across middleware and handlers&lt;/li&gt;
&lt;li&gt;No single keyword matches everything&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AI search understands intent.&lt;/p&gt;
&lt;h3&gt;
  
  
  Error Explanation &amp;amp; Debugging Hints
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Why it works:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Error messages are inconsistent across languages/frameworks&lt;/li&gt;
&lt;li&gt;Developers need context, not just stack traces&lt;/li&gt;
&lt;li&gt;Suggested fixes don't auto-execute&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;NullPointerException at line 47
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;AI can correlate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Recent code changes&lt;/li&gt;
&lt;li&gt;Similar past issues&lt;/li&gt;
&lt;li&gt;Common patterns in that file&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It doesn't fix it. It points you in the right direction.&lt;/p&gt;

&lt;h3&gt;
  
  
  Test Case Generation (First Draft)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Why it works:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Writing tests is high-effort, low-creativity work&lt;/li&gt;
&lt;li&gt;Generated tests are always reviewed&lt;/li&gt;
&lt;li&gt;Edge cases emerge through iteration&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
Given a function, generate initial unit tests covering:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Happy path&lt;/li&gt;
&lt;li&gt;Null inputs&lt;/li&gt;
&lt;li&gt;Boundary conditions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Developer refines from there.&lt;/p&gt;

&lt;h3&gt;
  
  
  Automated Code Review
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Why it fails:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Context requires understanding team conventions&lt;/li&gt;
&lt;li&gt;False positives erode trust&lt;/li&gt;
&lt;li&gt;Deterministic linters already catch syntax issues&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Automatic Refactoring
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Why it fails:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Breaking changes require 100% accuracy&lt;/li&gt;
&lt;li&gt;Semantic meaning must be preserved exactly&lt;/li&gt;
&lt;li&gt;One mistake ships to production&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Auto-Generated API Clients
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Why it fails:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OpenAPI specs already exist (structured input)&lt;/li&gt;
&lt;li&gt;Code generation tools are deterministic&lt;/li&gt;
&lt;li&gt;No ambiguity to resolve&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Mistake I See Most Often
&lt;/h2&gt;

&lt;p&gt;Developers use AI because it's impressive.&lt;/p&gt;

&lt;p&gt;Not because it's the right tool.&lt;/p&gt;

&lt;p&gt;I've done this. We all have.&lt;/p&gt;

&lt;p&gt;You see a cool demo and think: &lt;em&gt;"I could use that for..."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;But here's what actually happens:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;You bolt AI onto a problem that doesn't need it&lt;/li&gt;
&lt;li&gt;It works 90% of the time&lt;/li&gt;
&lt;li&gt;The 10% failure rate is unpredictable&lt;/li&gt;
&lt;li&gt;You spend more time handling edge cases than you saved&lt;/li&gt;
&lt;li&gt;You rebuild it without AI&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Save yourself the cycle.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Start with the simplest solution that could work.&lt;/p&gt;




&lt;h2&gt;
  
  
  How I Decide Now
&lt;/h2&gt;

&lt;p&gt;When someone asks me to build an AI feature, I ask:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"What happens if this gives the wrong answer?"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If the answer is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The user reviews and corrects it → Maybe AI&lt;/li&gt;
&lt;li&gt;We waste some time → Maybe AI&lt;/li&gt;
&lt;li&gt;We lose customer trust → Not AI&lt;/li&gt;
&lt;li&gt;We break production → Definitely not AI&lt;/li&gt;
&lt;li&gt;Nothing, it's just slower → Definitely not AI&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Problems Actually Worth Solving
&lt;/h2&gt;

&lt;p&gt;After shipping AI to production, here's what I've learned:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Good AI problems share these traits:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Ambiguity is inherent&lt;/strong&gt; – The problem can't be reduced to rules&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Human-in-the-loop is natural&lt;/strong&gt; – Someone reviews the output anyway&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Value comes from speed, not perfection&lt;/strong&gt; – 80% solution in 5 seconds beats 100% solution in 5 hours&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The alternative is hiring more people&lt;/strong&gt; – You're augmenting human judgment, not replacing deterministic code&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;For developer tooling specifically:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The sweet spot is: &lt;strong&gt;Tasks developers already do manually that require understanding context but not making critical decisions.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Writing boilerplate tests&lt;/li&gt;
&lt;li&gt;Searching codebases semantically&lt;/li&gt;
&lt;li&gt;Explaining unfamiliar error messages&lt;/li&gt;
&lt;li&gt;Generating first-draft documentation&lt;/li&gt;
&lt;li&gt;Suggesting variable names&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Not:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Deploying code&lt;/li&gt;
&lt;li&gt;Approving changes&lt;/li&gt;
&lt;li&gt;Granting access&lt;/li&gt;
&lt;li&gt;Modifying production configs&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What I'm Building Differently Now
&lt;/h2&gt;

&lt;p&gt;Instead of starting with What AI can do, I start with:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What are developers doing repeatedly that's:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Mentally tedious&lt;/strong&gt; (not challenging, just annoying)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context-heavy&lt;/strong&gt; (requires reading lots of code)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Non-critical&lt;/strong&gt; (mistakes are cheap)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then I ask:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Could a junior developer do this after reading the context?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If yes → AI might help.&lt;/p&gt;

&lt;p&gt;If no → I'm trying to automate judgment, and that won't work.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Hard Truth
&lt;/h2&gt;

&lt;p&gt;Most problems don't need AI.&lt;/p&gt;

&lt;p&gt;They need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Better documentation&lt;/li&gt;
&lt;li&gt;Clearer error messages&lt;/li&gt;
&lt;li&gt;Simpler abstractions&lt;/li&gt;
&lt;li&gt;Fewer edge cases&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AI feels like progress because it's new.&lt;/p&gt;

&lt;p&gt;But progress is solving the problem correctly, not impressively.&lt;/p&gt;




&lt;h2&gt;
  
  
  A Practical Exercise
&lt;/h2&gt;

&lt;p&gt;If you're reading this and thinking about an AI feature, try this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Write down the problem&lt;/li&gt;
&lt;li&gt;Describe the input (is it structured or chaotic?)&lt;/li&gt;
&lt;li&gt;Describe the acceptable output (is variance okay?)&lt;/li&gt;
&lt;li&gt;Write the deterministic solution (if you can)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If step 4 takes less than 100 lines of code → you don't need AI.&lt;/p&gt;

&lt;p&gt;If step 4 is impossible → AI might be the right tool.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I'm Doing Tomorrow
&lt;/h2&gt;

&lt;p&gt;I'm going to break down something most engineers skip:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How to actually structure an AI system once you've confirmed the problem is worth solving.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Because the architecture decisions you make early will determine whether your system is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reliable or brittle&lt;/li&gt;
&lt;li&gt;Maintainable or a black box&lt;/li&gt;
&lt;li&gt;Scalable or a one-off hack&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We'll cover:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Input validation (most failures happen here)&lt;/li&gt;
&lt;li&gt;Prompt orchestration (not just a single call)&lt;/li&gt;
&lt;li&gt;Output schemas (structured responses are non-negotiable)&lt;/li&gt;
&lt;li&gt;Fallback strategies (when AI doesn't know)&lt;/li&gt;
&lt;/ul&gt;
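&lt;p&gt;As a preview of the output-schema and fallback points, here is a minimal sketch: the required fields and the fallback value are illustrative assumptions, but the shape — validate the model's JSON, degrade safely when it fails — is the pattern:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import json

# Illustrative schema; a real system would use a full validator.
REQUIRED_FIELDS = {"summary", "confidence"}

def parse_model_output(raw, fallback):
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return fallback               # model returned non-JSON: don't guess
    if not REQUIRED_FIELDS.issubset(data):
        return fallback               # schema violation: degrade safely
    return data

FALLBACK = {"summary": "needs human review", "confidence": 0.0}
ok = parse_model_output('{"summary": "disk full", "confidence": 0.8}', FALLBACK)
bad = parse_model_output("I am not sure, but maybe...", FALLBACK)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;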




&lt;h2&gt;
  
  
  Final Thought
&lt;/h2&gt;

&lt;p&gt;We don’t have a shortage of AI techniques.&lt;/p&gt;

&lt;p&gt;RAG. Agents. Workflows. Fine-tuning.&lt;/p&gt;

&lt;p&gt;Those are solved problems at this point.&lt;/p&gt;

&lt;p&gt;What’s not solved is judgment.&lt;/p&gt;

&lt;p&gt;Knowing when AI improves a system &lt;br&gt;
and when it quietly makes it worse.&lt;/p&gt;

&lt;h2&gt;
  
  
  Most failures I’ve seen weren’t because the model was weak.
&lt;/h2&gt;

&lt;p&gt;They failed because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The problem didn’t need AI&lt;/li&gt;
&lt;li&gt;The system lacked constraints&lt;/li&gt;
&lt;li&gt;Or the cost of being wrong was underestimated&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AI is not a system. It’s a component.&lt;/p&gt;

&lt;p&gt;And if you design your system like it’s the brain,&lt;br&gt;
it will fail like one.&lt;/p&gt;

&lt;p&gt;If you’re building with AI, the real question isn’t:&lt;/p&gt;

&lt;h2&gt;
  
  
  “Can this work?”
&lt;/h2&gt;

&lt;p&gt;It’s:&lt;br&gt;
“What happens when it’s wrong?”&lt;/p&gt;

&lt;h2&gt;
  
  
  Because that’s where most systems break.
&lt;/h2&gt;

&lt;p&gt;This is Day 1 of documenting how I think about AI systems in production:&lt;br&gt;
what works, what breaks, and where things fail under real-world constraints.&lt;/p&gt;

&lt;p&gt;If you're working on similar problems, I’m especially interested in:&lt;br&gt;
Where did your system fail — and why?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hashnode → &lt;a href="https://hashnode.com/@abhimishra-devops90" rel="noopener noreferrer"&gt;https://hashnode.com/@abhimishra-devops90&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;LinkedIn → &lt;a href="https://www.linkedin.com/in/abhishek-mishra-aws-devops/" rel="noopener noreferrer"&gt;https://www.linkedin.com/in/abhishek-mishra-aws-devops/&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>webdev</category>
      <category>productivity</category>
    </item>
  </channel>
</rss>
