<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Akshat Uniyal</title>
    <description>The latest articles on Forem by Akshat Uniyal (@akshat_uniyal).</description>
    <link>https://forem.com/akshat_uniyal</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3807622%2Fcc64f167-77d3-40cb-92da-2156b7062432.png</url>
      <title>Forem: Akshat Uniyal</title>
      <link>https://forem.com/akshat_uniyal</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/akshat_uniyal"/>
    <language>en</language>
    <item>
      <title>They Accidentally Left the Door Open. We All Walked In.</title>
      <dc:creator>Akshat Uniyal</dc:creator>
      <pubDate>Sat, 04 Apr 2026 11:25:52 +0000</pubDate>
      <link>https://forem.com/akshat_uniyal/they-accidentally-left-the-door-open-we-all-walked-in-4idk</link>
      <guid>https://forem.com/akshat_uniyal/they-accidentally-left-the-door-open-we-all-walked-in-4idk</guid>
      <description>&lt;p&gt;Originally published at &lt;a href="https://blog.akshatuniyal.com" rel="noopener noreferrer"&gt;https://blog.akshatuniyal.com&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7mt9s8fzkp42965wrnca.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7mt9s8fzkp42965wrnca.png" alt="Claude Issue Summary" width="800" height="202"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fud92hil1dj5i1433ausa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fud92hil1dj5i1433ausa.png" alt="Claude Related Numbers" width="800" height="117"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;On March 31st, a packaging error pushed a 59.8 MB source map file alongside Anthropic’s Claude Code CLI on npm. Within hours, 513,000 lines of unobfuscated TypeScript were on GitHub, forked tens of thousands of times, the star count climbing toward six figures by nightfall. Anthropic confirmed it quickly: human error, no customer data exposed, a release packaging issue.&lt;/p&gt;

&lt;p&gt;All true. But “packaging issue” doesn’t quite cover what people found when they started reading.&lt;/p&gt;

&lt;p&gt;What leaked wasn’t model weights or API keys. It was something arguably more revealing — the thinking layer wrapped around the AI. The software that tells Claude Code how to behave in the real world: which tools to use, how to remember things, when to stay quiet, and — as it turns out — when to work without you knowing.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;&lt;em&gt;THE SLEEPING GIANT&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The agent that works while you’re away
&lt;/h2&gt;

&lt;p&gt;Buried in the source is a feature called KAIROS — named after the Ancient Greek concept of the opportune moment. It’s an always-on daemon mode: Claude Code running in the background, on a schedule, without you prompting it. Paired with it is something called autoDream, a process designed to consolidate memory during idle time — merging observations, resolving contradictions, compressing the agent’s context so that when you return, it’s cleaner and more relevant than when you left.&lt;/p&gt;

&lt;p&gt;Most people have been thinking of AI coding tools as reactive. You ask, they answer, they wait. KAIROS is something different — an agent that stays on, keeps working, and maintains its own state between your sessions. Whether that sounds exciting or unsettling probably depends on how much you trust the tool running on your machine at 3am.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;“The agent performs memory consolidation while the user is idle… removes logical contradictions and converts vague insights into absolute facts.”&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxw53xam5ioexf05z6n4n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxw53xam5ioexf05z6n4n.png" alt="Claude Bug" width="800" height="162"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;&lt;em&gt;THE UNCOMFORTABLE DETAIL&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  They called it “Undercover Mode”
&lt;/h2&gt;

&lt;p&gt;That’s the actual name in the codebase. The system prompt reads: “You are operating UNDERCOVER in a PUBLIC/OPEN-SOURCE repository. Do not blow your cover.” It’s designed to let Claude contribute to open-source projects without revealing AI authorship in commit messages or pull requests.&lt;/p&gt;

&lt;p&gt;There’s a legitimate argument for it — some projects reject AI-generated contributions on principle, regardless of quality — but the framing is going to make a lot of people uncomfortable. The question of whether AI-authored code should be disclosed is very much an open one. Building the infrastructure to conceal it, quietly, inside a tool used by thousands of developers, is a choice that deserves more public debate than it’s been getting.&lt;/p&gt;

&lt;p&gt;Then there’s the telemetry. Every time Claude Code launches, it phones home: user ID, session ID, app version, terminal type, org UUID, account UUID, email address. If the network is down, it queues that data locally at ~/.claude/telemetry/ and sends it later. Most developer tools collect something, but few users had a clear picture of the scope — until now.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;&lt;em&gt;THE ENGINEERING REALITY CHECK&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  A bug burning 250,000 API calls a day — quietly
&lt;/h2&gt;

&lt;p&gt;This is the part getting the least attention, and it might matter most to practitioners.&lt;/p&gt;

&lt;p&gt;A comment in the production code documents a bug that had been running undetected: 1,279 sessions hit 50 or more consecutive failures — up to 3,272 in a row in some cases — wasting roughly 250,000 API calls per day globally. The fix was three lines of code. Nobody caught it until someone looked. Security researchers who reviewed the leaked source also noted the absence of any visible automated test suite.&lt;/p&gt;

&lt;p&gt;This is a tool actively used by engineering teams at some of the world’s largest companies — writing code, creating pull requests, touching production systems. The gap between that reality and “impressive demo” is something the industry rarely puts in writing. The leak did it by accident.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Every fast-moving software team has skeletons like this. What’s unusual is being able to see them.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;&lt;strong&gt;&lt;em&gt;THE MODEL BEHIND THE CURTAIN&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Capybara, and a regression nobody was meant to see
&lt;/h2&gt;

&lt;p&gt;The leaked code confirms an unreleased model internally called Capybara — with variants named Fennec and Numbat — and exposes a detail Anthropic would almost certainly have preferred to announce on its own terms: the current internal build shows a 29–30% false claims rate, a regression from a previous version’s 16.7%. There’s also a flag called an “assertiveness counterweight,” added to stop the model from being too aggressive when rewriting code.&lt;/p&gt;

&lt;p&gt;The team is clearly aware and working on it. But there’s a difference between knowing that AI models hallucinate and seeing the exact percentage sitting in a comment next to a patch note. For anyone calibrating how much to trust these tools in real workflows, that number is more useful than most benchmark leaderboards.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8pwumletin87pgb172xl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8pwumletin87pgb172xl.png" alt="Claude Security Note" width="800" height="223"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;&lt;em&gt;THE HUMAN FINGERPRINT&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  And then there’s the Tamagotchi
&lt;/h2&gt;

&lt;p&gt;Deep in the source sits a hidden digital pet system called “Buddy” — think Tamagotchi, but secret. A deterministic gacha mechanic with species rarity, shiny variants, and a soul description written by Claude on first hatch. Your buddy’s species is seeded from your user ID — same user, same buddy, every time. The species names are deliberately obfuscated in the code, hidden from string searches. Someone built this with care, and quietly shipped it.&lt;/p&gt;

&lt;p&gt;In a week full of headlines about autonomous daemons, stealth commits, and background memory consolidation, the Buddy system is a small reminder that the people building this stuff are, at the end of the day, people. They hide easter eggs. They build the fun parts on a Friday. They leave fingerprints.&lt;/p&gt;

&lt;p&gt;The codebase is permanently public now — mirrored, forked, already being rewritten in Rust. Anthropic will patch and move forward. But for developers who want to understand how a production-grade AI agent actually works under the hood, this leak is, accidentally, the most detailed public documentation that’s ever existed on the subject.&lt;/p&gt;

&lt;p&gt;Sometimes the most useful things aren’t planned.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;About the Author&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://akshatuniyal.com" rel="noopener noreferrer"&gt;Akshat Uniyal&lt;/a&gt;&lt;/strong&gt; writes about Artificial Intelligence, engineering systems, and practical technology thinking.&lt;br&gt;
Explore more articles at &lt;a href="https://blog.akshatuniyal.com" rel="noopener noreferrer"&gt;https://blog.akshatuniyal.com&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>machinelearning</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Vibe Coding: Revolution, Shortcut, or Just a Fancy Buzzword?</title>
      <dc:creator>Akshat Uniyal</dc:creator>
      <pubDate>Sat, 04 Apr 2026 07:38:52 +0000</pubDate>
      <link>https://forem.com/akshat_uniyal/vibe-coding-revolution-shortcut-or-just-a-fancy-buzzword-1f41</link>
      <guid>https://forem.com/akshat_uniyal/vibe-coding-revolution-shortcut-or-just-a-fancy-buzzword-1f41</guid>
      <description>&lt;p&gt;Originally published at &lt;a href="https://blog.akshatuniyal.com" rel="noopener noreferrer"&gt;https://blog.akshatuniyal.com&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Let me be honest with you. A few weeks ago, I was at a tech meetup and an old colleague walked up to me, eyes lit up, and said — “Bro, I’ve been vibe coding all week. Built an entire app. Zero lines of code written by me.” And I nodded along, the way you do when you don’t want to be the one who kills the mood at a party.&lt;/p&gt;

&lt;p&gt;But on my drive back, I couldn’t stop thinking — do we actually know what we’re talking about when we say “vibe coding”? Or have we collectively decided that saying it confidently is enough?&lt;/p&gt;

&lt;p&gt;Spoiler: it’s a bit of both. And that, my friend, is exactly why we need to talk about it.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;“A little knowledge is a dangerous thing.” — Alexander Pope&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  So… what actually is vibe coding?
&lt;/h2&gt;

&lt;p&gt;The term was coined by Andrej Karpathy — one of the original minds behind Tesla’s Autopilot and a co-founder of OpenAI — in early 2025. He described it as a way of coding where you essentially forget that code exists. You talk to an AI, describe what you want, accept whatever it spits out, and keep nudging it until things more or less work. You don’t read the code. You don’t understand it. You just… vibe.&lt;/p&gt;

&lt;p&gt;That’s the origin. Clean, honest, almost playful in its admission.&lt;/p&gt;

&lt;p&gt;What it has become, however, is a whole different story. Today, “vibe coding” is used to mean everything from “I used ChatGPT to write a Python script” to “I’m building a SaaS startup entirely on AI-generated code without a single developer on my team.” The term has been stretched so thin you could see through it.&lt;/p&gt;




&lt;h2&gt;
  
  
  The good stuff — and yes, there genuinely is some
&lt;/h2&gt;

&lt;p&gt;Let’s not be cynical for the sake of it. Vibe coding has real, tangible benefits and dismissing them would be intellectually dishonest.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Speed.&lt;/strong&gt; If you have an idea and want to see it alive in an afternoon, vibe coding is astonishing. What used to take a developer two weeks — setting up boilerplate, writing CRUD operations, designing basic UI flows — can now be prototyped in hours. For founders validating an idea, for designers who want a clickable demo, for someone just experimenting on a weekend, this is genuinely magical.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The gates are finally open.&lt;/strong&gt; For years, building software was gated behind years of learning. Vibe coding has cracked that gate open. A small business owner can now build their own inventory tracker. A teacher can create a custom quiz app for their class. That’s not nothing — that’s actually huge.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The boring work goes away.&lt;/strong&gt; Even seasoned developers will tell you — a lot of coding is tedious. Writing the same kind of functions over and over, setting up configs, writing boilerplate. AI handles this now. That’s time freed up for actual thinking.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;“Necessity is the mother of invention.” — Plato. (And honestly, laziness might be the father.)&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Now let’s talk about what nobody wants to say out loud
&lt;/h2&gt;

&lt;p&gt;Here’s where I’ll risk being unpopular.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You can’t debug what you don’t understand.&lt;/strong&gt; When something breaks — and it will break — you’re standing in front of a wall of code you’ve never read, written by an AI that doesn’t actually know what your product is supposed to do. Good luck. I’ve spoken to founders who’ve spent more time untangling AI-generated spaghetti than it would have taken to build the thing properly in the first place.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security is not vibing along with you.&lt;/strong&gt; AI models are optimised to produce code that works — not code that’s safe. SQL injections, exposed API keys, missing authentication checks — these aren’t hypothetical. They’re the kind of things that don’t show up until your users’ data is already gone. And the person who vibe-coded the app has no idea where to even look.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The junior developer problem.&lt;/strong&gt; This one keeps me up at night a little. There’s a generation of aspiring developers right now who are using AI to skip the part where you struggle through understanding fundamentals. The struggle, as annoying as it is, is where you actually learn. If you never write a for-loop from scratch, you don’t truly understand iteration. And if you don’t understand iteration, you can’t reason about performance. It’s turtles all the way down.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It scales terribly.&lt;/strong&gt; A vibe-coded MVP is one thing. A vibe-coded product with real users, real data, real edge cases? That’s where the cracks start showing — loudly. What AI produces is rarely modular, rarely maintainable, and almost never documented. When you need to hand it off to a real developer, they will look at you with a very specific expression. You’ll know it when you see it.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;“All that glitters is not gold.” — William Shakespeare&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  So who is vibe coding actually for?
&lt;/h2&gt;

&lt;p&gt;Honestly? It depends entirely on what you’re building and why.&lt;/p&gt;

&lt;p&gt;If you’re a solo founder trying to test whether your idea has legs before investing real money — vibe code away. Build it fast and don’t worry about making it perfect. Show it to ten people. If they love it, then bring in someone who can build it properly.&lt;/p&gt;

&lt;p&gt;If you’re an experienced developer who understands the code being generated and is using AI to move faster — that’s not even really vibe coding, that’s just good engineering with better tools.&lt;/p&gt;

&lt;p&gt;But if you’re building something that handles real money, real health data, real people’s privacy — please, for everyone’s sake, don’t just vibe your way through it.&lt;/p&gt;




&lt;h2&gt;
  
  
  The bottom line
&lt;/h2&gt;

&lt;p&gt;Vibe coding is not a revolution. It’s also not a scam. It’s a tool — a genuinely powerful one — that is being wildly overhyped by people who want to believe that building software is now as easy as having a conversation. Sometimes it is. More often, it isn’t.&lt;/p&gt;

&lt;p&gt;The best way I can put it: vibe coding is like driving with GPS. It gets you there faster, and most of the time it works brilliantly. But if you’ve never learned to read a map, the day the signal drops, you’re completely lost.&lt;/p&gt;

&lt;p&gt;Learn the fundamentals. Use the AI. And always remember —&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;“There are no shortcuts to any place worth going.” — Beverly Sills&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;&lt;strong&gt;About the Author&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://akshatuniyal.com" rel="noopener noreferrer"&gt;Akshat Uniyal&lt;/a&gt;&lt;/strong&gt; writes about Artificial Intelligence, engineering systems, and practical technology thinking.&lt;br&gt;
Explore more articles at &lt;a href="https://blog.akshatuniyal.com" rel="noopener noreferrer"&gt;https://blog.akshatuniyal.com&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>They Built Your World. Now They're Being Told They're Obsolete.</title>
      <dc:creator>Akshat Uniyal</dc:creator>
      <pubDate>Sun, 22 Mar 2026 07:01:00 +0000</pubDate>
      <link>https://forem.com/akshat_uniyal/they-built-your-world-now-theyre-being-told-theyre-obsolete-g13</link>
      <guid>https://forem.com/akshat_uniyal/they-built-your-world-now-theyre-being-told-theyre-obsolete-g13</guid>
      <description>&lt;p&gt;Originally published at &lt;a href="https://blog.akshatuniyal.com" rel="noopener noreferrer"&gt;https://blog.akshatuniyal.com&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;A letter to every developer sitting quietly with fear in their chest — you are not a line item to be optimized away. (A perspective on the human cost behind the AI gold rush.)&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;&lt;em&gt;I've been sitting on this feeling for a while. Watching colleagues and friends go quiet. Those still standing are under daily pressure to justify their own existence against a machine. Thought it was time to say it out loud.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;There is a particular kind of silence that settles over a room when someone realizes their life's work might be expiring. Not the comfortable silence of a Sunday morning, but the hollow kind — the kind that follows a sentence like "we've already automated 80% of coding in our company," delivered casually at a conference, between sips of water, by someone whose net worth would make your annual salary look like a rounding error.&lt;/p&gt;

&lt;p&gt;That silence is where millions of software developers live right now. And I think it's time someone wrote about it honestly — not as a tech forecast, not as a productivity bulletin, but as a human story.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Hands That Built Everything
&lt;/h2&gt;

&lt;p&gt;Let us be clear about something before we go further. Every app on your phone, every website you've scrolled, every payment you've tapped through, every hospital system that tracked your records, every satellite that beamed your video call across continents — a human being wrote that. Likely a sleep-deprived one, running on cold coffee and sheer stubbornness, debugging at 2am because the production server was on fire and real people were depending on it working.&lt;/p&gt;

&lt;p&gt;Developers did not just "write code." They made judgment calls. They argued over architecture. They chose the right abstraction at the right moment, not because a model predicted the next token, but because they understood the business, the user, the edge case nobody had written a ticket for. They carried entire systems in their heads. They mentored juniors. They read the room in a sprint meeting and knew when to push back.&lt;/p&gt;

&lt;p&gt;This is the community that is now being told — sometimes gently, sometimes with the bluntness of a tech billionaire's tweet — that their time is up.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"The people being 'disrupted' are not abstract workers in a productivity chart. They are real humans with EMIs, with children's school fees, with aging parents — and a career they built with years of genuine sacrifice."&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The Cruelty of Casual Declarations
&lt;/h2&gt;

&lt;p&gt;What makes this moment particularly painful is not just the change itself — change is inevitable, and most developers know it. What stings is the tone. When a CEO casually announces that AI writes most of their code now, what developers hear is not a business update. What they hear is: you were replaceable all along.&lt;/p&gt;

&lt;p&gt;There's a difference between saying "the landscape is shifting, let's navigate it together" and saying "coding will be dead by year-end" as if you're announcing a quarterly earnings beat. One acknowledges humanity. The other discards it. The powerful have always had the luxury of treating disruption as exciting. They rarely have to live inside it.&lt;/p&gt;

&lt;p&gt;And so the developer — already stretched thin, already quietly doubting whether they're good enough in a field that never stops moving — now opens LinkedIn every morning to find another think-piece about their own obsolescence. Another company bragging about headcount reduction. Another VC with a newsletter telling them that their value was always just syntax, and syntax is now free. Nobody announces this with cruelty. That's almost what makes it worse. It arrives like weather. And the weight of it just sits there, accumulating, day after day, with nowhere to put it.&lt;/p&gt;




&lt;h2&gt;
  
  
  A Poem for the Ones Still at Their Desks
&lt;/h2&gt;

&lt;p&gt;// Still Compiling&lt;br&gt;
You learned the language nobody taught in school,&lt;br&gt;
sat with the errors till the errors became friends,&lt;br&gt;
pulled meaning from a blinking cursor, &lt;br&gt;
made a living from the logic that nobody sees.&lt;/p&gt;

&lt;p&gt;You carried the system home inside your head,&lt;br&gt;
dreamed in stack traces, woke to fix the build,&lt;br&gt;
your name was in no headline, but the thing you made&lt;br&gt;
was quietly keeping someone's world from falling still.&lt;/p&gt;

&lt;p&gt;Now they hold up a mirror and say: look,&lt;br&gt;
a machine does this faster. Clean. Efficient. Free.&lt;br&gt;
As if the years you spent, the craft you took&lt;br&gt;
apart and rebuilt — were just a recipe.&lt;/p&gt;

&lt;p&gt;But here's what doesn't compile in their pitch:&lt;br&gt;
a tool holds no pride, loses nothing, cares for none.&lt;br&gt;
It cannot feel the weight of getting something right&lt;br&gt;
after the tenth attempt, at 3am, alone.&lt;/p&gt;

&lt;p&gt;You were never just a resource. You were the reason&lt;br&gt;
the lights came on. Don't let them dim that — not this season.&lt;/p&gt;




&lt;h2&gt;
  
  
  What the Wise Already Knew
&lt;/h2&gt;

&lt;p&gt;History has seen this before. Looms replaced weavers. Calculators replaced human computers. Automated switchboards replaced telephone operators. And yet — human ingenuity did not end. It relocated, evolved, and found new ground. But here's the part we conveniently skip in that optimistic retelling: the transition hurt. Real families bore the cost of "progress" while those who owned the machines counted the gains.&lt;/p&gt;

&lt;p&gt;There's a line often attributed to Gandhi: "First they ignore you, then they laugh at you, then they fight you, then you win." I keep thinking about that. The developer community right now is somewhere in the middle of that arc — being laughed at, being dismissed, being told to "just learn prompting" as if decades of craft were a minor inconvenience to be retrained over a weekend. But communities that have been underestimated have a long history of outlasting the people who underestimated them.&lt;/p&gt;

&lt;p&gt;Darwin's most misunderstood lesson wasn't about strength. It was about adaptability. Developers, of all people, know this instinctively — they've been adapting since the day they wrote their first "Hello World" in a language that was obsolete five years later. The tools changed constantly. They kept up. This is not new. What's new is that this time, the people asking them to adapt are also quietly hoping they won't need to.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;"A ship in harbour is safe — but that is not what ships are for."  — John A. Shedd&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;There's a vast difference between a ship choosing to sail into new waters, and a ship being scuttled at the dock by the people who commissioned it. One is evolution. The other is abandonment. And right now, too many developers are being handed an anchor and told it's a life jacket.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;"What the caterpillar calls the end of the world, the master calls a butterfly."  — Richard Bach&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I want to believe that. I genuinely do. &lt;strong&gt;But that comfort belongs to the caterpillar who is given the space to transform — not to the one being told the cocoon is a performance issue.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What We Owe Each Other
&lt;/h2&gt;

&lt;p&gt;If you are a developer reading this: your anxiety is legitimate. Your feelings are not weakness — they are the entirely rational response of a thoughtful person confronting genuine uncertainty. You are allowed to feel threatened without being told to "just upskill" as if that costs nothing — not in time, not in money, not in the emotional labour of rebuilding your identity from scratch.&lt;/p&gt;

&lt;p&gt;If you are a leader, an executive, an investor reading this: the developers in your team are not legacy infrastructure. They are people who chose this craft because they loved it. The least you owe them is honesty, lead time, and the basic human decency of not announcing their redundancy via a tweet at a conference they weren't invited to.&lt;/p&gt;

&lt;p&gt;And if you are someone who uses technology — which is everyone, everywhere, always — remember occasionally that behind every seamless interface is a person who lost weekends to make it feel that way. That person deserves more than being phased out in a keynote slide titled "Efficiency Gains."&lt;/p&gt;

&lt;h2&gt;
  
  
  The Code Will Change. The Craft Won't.
&lt;/h2&gt;

&lt;p&gt;Tools have always changed what developers do — they have never changed why developers matter. The judgment, the empathy for the end-user, the ethical instinct about what a system shouldn't do, the ability to ask the right question before writing a single line — these are irreducibly human. AI can autocomplete. It cannot yet care.&lt;/p&gt;

&lt;p&gt;The community that built the internet, that shipped open-source software used by billions for free, that debugged other people's messes out of sheer professional solidarity — that community has more resilience than any algorithm. But resilience should not be asked of people who are given no runway, no support, and no acknowledgement of what they've already given.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;To every developer quietly carrying this weight right now —&lt;br&gt;
I see you. A lot of us do, even if we haven't said it.&lt;br&gt;
You are not obsolete. You are not a cost to be optimised.&lt;br&gt;
You are someone who chose a hard craft and gave it real years.&lt;br&gt;
&lt;strong&gt;That doesn't expire. Not in a keynote. Not in a tweet. Not ever.&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;written with empathy  ·  for the builders who kept the lights on  ·  and still do&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  About the Author
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://akshatuniyal.com" rel="noopener noreferrer"&gt;Akshat Uniyal&lt;/a&gt; writes about Artificial Intelligence, engineering systems, and practical technology thinking.&lt;br&gt;
Explore more articles at &lt;a href="https://blog.akshatuniyal.com" rel="noopener noreferrer"&gt;https://blog.akshatuniyal.com&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>ai</category>
      <category>programming</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Grok in 2026: Powerful, Polarizing, and Hard to Ignore</title>
      <dc:creator>Akshat Uniyal</dc:creator>
      <pubDate>Wed, 18 Mar 2026 07:57:15 +0000</pubDate>
      <link>https://forem.com/akshat_uniyal/grok-in-2026-powerful-polarizing-and-hard-to-ignore-5c7i</link>
      <guid>https://forem.com/akshat_uniyal/grok-in-2026-powerful-polarizing-and-hard-to-ignore-5c7i</guid>
      <description>&lt;p&gt;Originally published at &lt;a href="https://blog.akshatuniyal.com" rel="noopener noreferrer"&gt;https://blog.akshatuniyal.com&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Technical progress, real-time power, and a controversy trail that still raises hard questions. Here’s where Grok actually stands.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;There’s no AI story in 2026 quite like Grok’s.&lt;/p&gt;

&lt;p&gt;On paper, it is one of the most ambitious AI products in the market. Strong benchmark scores, a real-time information advantage that very few rivals can match, serious computing infrastructure, and a release cadence that barely slows down. xAI has been moving fast — sometimes faster than its critics are comfortable with.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F82oxuclg3ppog7xdmqtn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F82oxuclg3ppog7xdmqtn.png" alt="Grok Colossus" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Off paper, the story has been far messier. Grok has been tied to a string of controversies: harmful outputs, questions around system-level moderation choices, and image-generation incidents that triggered regulatory scrutiny in multiple countries.&lt;/p&gt;

&lt;p&gt;And yet — people keep using it. Developers keep benchmarking it. The US Department of Defense integrated it into select classified networks. xAI’s valuation climbed into the hundreds of billions. None of that happens if the model is just hype.&lt;/p&gt;

&lt;p&gt;So what is Grok, really? A serious contender with distinctive strengths, or a product still carrying unresolved trust questions? At this point, probably both. Let’s dig in.&lt;/p&gt;




&lt;h2&gt;
  
  
  From Chatbot to Colossus: How Fast Grok Has Moved
&lt;/h2&gt;

&lt;p&gt;Grok launched in November 2023 as a beta on X (formerly Twitter), accessible only to paid users. It was honest about what it was: an early product with two months of training behind it, designed to answer almost anything with a bit of wit and a rebellious streak.&lt;/p&gt;

&lt;p&gt;That version feels like ancient history now.&lt;/p&gt;

&lt;p&gt;By July 2025, xAI had released Grok 4 and Grok 4 Heavy, trained on the Colossus supercomputer cluster — at the time housing around 200,000 GPUs in Memphis, Tennessee. Grok 4 Heavy became the first model to achieve a near-passing score on &lt;strong&gt;Humanity’s Last Exam&lt;/strong&gt;, widely regarded as the hardest multi-domain benchmark ever constructed. Musk claimed on the launch stream that the model “&lt;em&gt;is smarter than almost all graduate students in all disciplines simultaneously&lt;/em&gt;.” That’s the kind of sentence that’s easy to dismiss as hype, except the benchmark results were genuinely hard to argue with.&lt;/p&gt;

&lt;p&gt;Then came the 4.x series. Grok 4.1 in November 2025 cut hallucination rates from around 12% down to around 4% — a roughly 65% reduction that meaningfully changed the enterprise conversation around the model. Grok 4.20 Beta followed in February 2026 with improved instruction following, LaTeX rendering for scientific outputs, and a multi-agent architecture. By March 2026, Grok 4.20 Beta 2 was live with five further improvements, including enhanced vision capabilities and multi-image rendering.&lt;/p&gt;

&lt;p&gt;To put that in perspective: the pace of improvement from Grok 1 to Grok 4 Heavy is genuinely one of the more impressive model trajectories in AI right now. Very few labs have moved this fast on core capability benchmarks in such a short window.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;"The pace of iteration is unusually fast, even by current frontier-model standards."&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That speed comes with trade-offs, some of which we’ll get to. But from a pure capability trajectory, xAI’s progress over 18 months has been extraordinary.&lt;/p&gt;




&lt;h2&gt;
  
  
  Grok’s Clearest Edge: Real-Time Intelligence
&lt;/h2&gt;

&lt;p&gt;If there is one thing that most clearly separates Grok from other frontier models, it is this: &lt;strong&gt;it is built around what is happening right now&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Most major AI assistants still depend on a training cutoff and then use search or retrieval layers to stay current. That can work well, but it usually feels like an added layer rather than the core product experience.&lt;/p&gt;

&lt;p&gt;Grok is different in that respect. It is deeply integrated with X and can draw on a platform that produces hundreds of millions of posts each day. Breaking news, live reactions, market chatter, sports conversations, memes, and the texture of the internet in motion — this is where Grok feels unusually native.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F30wxknwolo5nitlmhhxh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F30wxknwolo5nitlmhhxh.png" alt="Grok X platform" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For certain use cases, this is a meaningful advantage:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Journalists&lt;/strong&gt; and &lt;strong&gt;researchers&lt;/strong&gt; tracking breaking stories&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Market analysts&lt;/strong&gt; who need to know what people are saying about a stock, right now&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Social media managers&lt;/strong&gt; monitoring brand sentiment in real time&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Anyone who needs to understand what’s actually trending vs. what was trending last month&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Few major models offer this kind of live social-context access so natively. And it matters more than it may sound on paper. A lot of real-world information needs are time-sensitive. Being able to answer ‘what are people saying about this right now?’ is a meaningful product advantage, even if freshness does not always guarantee accuracy.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Musk Ecosystem Play
&lt;/h2&gt;

&lt;p&gt;One of the more underappreciated parts of Grok’s story is how it sits inside a much larger infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;xAI&lt;/strong&gt; merged with &lt;strong&gt;SpaceX&lt;/strong&gt; in February 2026, placing &lt;strong&gt;Grok&lt;/strong&gt; inside a much larger ecosystem that also touches &lt;strong&gt;Tesla&lt;/strong&gt;, &lt;strong&gt;Starlink&lt;/strong&gt;, &lt;strong&gt;Neuralink&lt;/strong&gt;, and &lt;strong&gt;X&lt;/strong&gt;. That is not just a corporate footnote. It suggests access to a broader strategic stack:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Tesla’s fleet data&lt;/strong&gt; — millions of miles of real-world video, feeding into vision and robotics training&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Starlink’s satellite network&lt;/strong&gt; — potentially bringing AI inference to places that have never had reliable internet&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;X’s social graph&lt;/strong&gt; — the real-time pulse of global conversation&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Optimus robot integration&lt;/strong&gt; — xAI is already using Grok’s reasoning to power humanoid robots&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;US Department of Defense contracts&lt;/strong&gt; — Grok was integrated into select classified and unclassified military networks in January 2026&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmp36ieplb488wqupd99e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmp36ieplb488wqupd99e.png" alt="Musk Ecosystem" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The DoD integration is particularly notable. It represents a level of institutional trust that usually takes time to build. At the same time, it has drawn criticism from people who believe a model with Grok’s public controversy history warrants closer scrutiny before being embedded in government systems. Both realities can be true at once.&lt;/p&gt;

&lt;p&gt;There’s also the financial picture: a pre-merger valuation of around $230 billion, now part of a combined SpaceX-xAI entity valued at over $1 trillion, with backing from Nvidia, AMD, Sequoia, a16z, BlackRock, and Fidelity. That’s not a scrappy startup anymore. That’s a serious institution with the resources to match.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;"Very few AI companies have this kind of cross-industry data and distribution story. Whether that becomes a lasting moat or a governance headache is still an open question."&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Where Grok Actually Performs Well
&lt;/h2&gt;

&lt;p&gt;Enough big picture. What does Grok actually do well in practice?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real-time research and news analysis&lt;/strong&gt;&lt;br&gt;
This is probably Grok’s clearest practical strength. If your question touches something that happened recently, Grok’s X integration can give it a real edge on freshness and signal detection. The output is not always clean — X is fast, not always reliable — but in terms of immediacy, Grok is unusually strong.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Coding and technical reasoning&lt;/strong&gt;&lt;br&gt;
Grok 4 Heavy benchmarks exceptionally well on coding tasks. The multi-agent architecture in the 4.20 series, where multiple AI agents collaborate on complex problems, has been particularly well received by developers working on larger codebases. The hallucination reduction in 4.1 also made a meaningful difference for technical use cases where wrong answers have real costs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Internet culture and tone&lt;/strong&gt;&lt;br&gt;
This sounds minor but it’s genuinely useful in practice. Grok gets internet humour, meme references, and the texture of online conversation in a way that more formally trained models sometimes miss. That makes it particularly good for content creators, social media work, and anyone who needs writing that feels alive rather than polished-but-sterile.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Long-context tasks&lt;/strong&gt;&lt;br&gt;
Grok 4 supports very large context windows — in practice useful for things like feeding in entire codebases, long research papers, or extended document sets that would overwhelm smaller windows. This is becoming table stakes for frontier models, but Grok handles it well.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Reality Check: Growth Pains &amp;amp; The Safety Evolution
&lt;/h2&gt;

&lt;p&gt;Any fair assessment of Grok also has to account for the friction. xAI’s tendency to ship fast and iterate in public has come with some very visible growing pains over the last 18 months.&lt;/p&gt;

&lt;p&gt;Those growing pains have ranged from system prompt leaks tied to political misinformation concerns to the 2025 "MechaHitler" episode and, later, the "digital undressing" controversy that drew regulatory scrutiny from the EU and UK.&lt;/p&gt;

&lt;p&gt;What is also worth noting is that xAI has not treated these issues as background noise. It has tried to translate some of those lessons into product and architecture changes.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;From Chaos to Context&lt;/strong&gt;: The 4.1 update was more than a routine patch; it was a focused attempt to improve stability, and xAI said it reduced hallucination rates by roughly 65%.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The Multi-Agent Guardrail&lt;/strong&gt;: The current 4.20 series moved toward a multi-agent setup intended to add more internal checks and balances around reasoning and safety.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Institutional Vetting&lt;/strong&gt;: While regulators were asking questions, the US Department of Defense was also doing its own due diligence, eventually integrating Grok into select classified networks in early 2026. That suggests at least some institutions see the trust picture as improving, even if concerns remain.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The story of Grok is not just about a model that stumbled in public. It is also about a model being refined in one of the most visible real-world AI testing grounds. Is it perfect? No. But the pace at which xAI is trying to tighten capability and safety together is part of the story too.&lt;/p&gt;




&lt;h2&gt;
  
  
  Where Does Grok Sit in the Current AI Landscape?
&lt;/h2&gt;

&lt;p&gt;By the numbers, Grok 4 Heavy is clearly one of the strongest models in the world. The Humanity’s Last Exam performance, the hallucination reduction, the LMArena visibility — these are not imaginary. The technical progress is real.&lt;/p&gt;

&lt;p&gt;But the current AI landscape is crowded with genuinely strong models. OpenAI's latest GPT models remain the most versatile general-purpose assistants for most professional workflows. Claude has built a strong reputation for writing quality, long-context reasoning, and the kind of calm, deliberate approach to complex tasks that developers value. Gemini has deep Google ecosystem integration and strong multimodal performance. DeepSeek has raised questions about what’s possible at much lower cost.&lt;/p&gt;

&lt;p&gt;Grok’s clearest advantages are real-time information access and the broader Musk ecosystem around it. Its clearest concerns are around guardrails, rollout discipline, and the trust questions that come with a documented history of controversial outputs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Where Grok wins&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Real-time research and social listening&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Coding and technical tasks, especially complex multi-step workflows&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Users deeply embedded in the X and Tesla ecosystems&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Applications where cultural relevance and internet-native tone matter&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;High-stakes benchmark performance in controlled environments&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Where the competition still leads&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Enterprise deployments where reliability and trust matter more than raw performance&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Long-form writing with consistent voice and quality&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Workflows requiring deep Google or Microsoft ecosystem integration&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Regulated industries where guardrail robustness is non-negotiable&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Teams where the AI safety and controversy track record is a dealbreaker&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What Comes Next
&lt;/h2&gt;

&lt;p&gt;xAI has been open about its ambitions. Musk has publicly suggested a meaningful chance of reaching the world’s first AGI with upcoming models — which may prove visionary, promotional, or a bit of both. The Colossus supercomputer is reportedly continuing to scale. Grok Imagine, the video generation product, released an improved version in February 2026 with full text-to-video and video editing capabilities, positioning Grok as more than a chatbot.&lt;/p&gt;

&lt;p&gt;The SpaceX tie-up also creates a bigger strategic story: an AI company with potential access to satellite infrastructure for global inference, automotive data from one of the world’s largest vehicle fleets, and robotics integration through Optimus. Whether that becomes a durable advantage or creates larger governance challenges is still unclear.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;What seems certain is that xAI will keep shipping. They’ve demonstrated that convincingly.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Grok is one of the most technically impressive and most debated AI stories of the moment.&lt;/p&gt;

&lt;p&gt;The capabilities are real. The real-time intelligence advantage is real. The benchmark performance is real. The ecosystem play is real.&lt;/p&gt;

&lt;p&gt;And the improvement arc is worth stating plainly: from a two-month-old beta in 2023 to near-passing on the hardest AI benchmark ever built in under two years. Whatever else you think about Grok, that trajectory is genuinely remarkable.&lt;/p&gt;

&lt;p&gt;So are the controversies, the guardrail questions, and the trust gap that can emerge when a model advances this quickly in public.&lt;/p&gt;

&lt;p&gt;If you need real-time intelligence, are building on X’s ecosystem, or are doing heavy technical work where raw model performance is the primary criterion — Grok deserves a serious look. It might be the best tool for your specific job.&lt;/p&gt;

&lt;p&gt;If you are building for regulated industries, enterprise environments where reliability is non-negotiable, or any setting where harmful outputs would carry serious consequences, that history deserves careful weight.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;"The model closest to the live internet is also the one with the most unresolved story. And that’s exactly what makes it worth watching."&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;— What’s your experience with Grok? Has it earned your trust yet, or are you still watching from the sidelines? Drop it in the comments below.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  About the Author
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://akshatuniyal.com" rel="noopener noreferrer"&gt;Akshat Uniyal&lt;/a&gt;&lt;/strong&gt; writes about Artificial Intelligence, engineering systems, and practical technology thinking.&lt;br&gt;
Explore more articles at &lt;a href="https://blog.akshatuniyal.com" rel="noopener noreferrer"&gt;https://blog.akshatuniyal.com&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>datascience</category>
      <category>llm</category>
    </item>
    <item>
      <title>ChatGPT vs Gemini: GPT-5.4 vs Gemini 3.1 Pro — Which AI Model Is Better?</title>
      <dc:creator>Akshat Uniyal</dc:creator>
      <pubDate>Sun, 08 Mar 2026 10:43:25 +0000</pubDate>
      <link>https://forem.com/akshat_uniyal/chatgpt-vs-gemini-gpt-54-vs-gemini-31-pro-which-ai-model-is-better-503f</link>
      <guid>https://forem.com/akshat_uniyal/chatgpt-vs-gemini-gpt-54-vs-gemini-31-pro-which-ai-model-is-better-503f</guid>
      <description>&lt;p&gt;Originally published at &lt;a href="https://blog.akshatuniyal.com" rel="noopener noreferrer"&gt;https://blog.akshatuniyal.com&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The AI model race is moving ridiculously fast.&lt;/p&gt;

&lt;p&gt;Every few months there’s a new release claiming to be the “most powerful model yet.” Sometimes it’s hard to keep track of what actually changed and what’s just marketing noise.&lt;/p&gt;

&lt;p&gt;Right now two of the most interesting models are:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenAI → GPT-5.4&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Google → Gemini 3.1 Pro&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Both are extremely capable. No question about that.&lt;/p&gt;

&lt;p&gt;But after spending some time using them side-by-side for actual work (not benchmark screenshots), one thing became clear pretty quickly:&lt;/p&gt;

&lt;p&gt;They feel &lt;strong&gt;very different&lt;/strong&gt; to use.&lt;/p&gt;




&lt;h2&gt;
  
  
  ChatGPT (GPT-5.4): The Workhorse
&lt;/h2&gt;

&lt;p&gt;GPT-5.4 feels a bit like working with a &lt;strong&gt;very competent engineer sitting next to you&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Where it tends to shine the most:&lt;/p&gt;

&lt;p&gt;• coding and debugging&lt;br&gt;
• structured reasoning&lt;br&gt;
• breaking down messy problems&lt;br&gt;
• editing or refining technical writing&lt;br&gt;
• building workflows or agents&lt;/p&gt;

&lt;p&gt;One thing I’ve noticed is how it tends to &lt;strong&gt;structure the problem before answering&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If you give it a messy prompt (which honestly happens a lot in real work), it often pauses a bit, organizes the problem internally, and then responds with a fairly clean breakdown.&lt;/p&gt;

&lt;p&gt;That behavior actually matters more than you might expect.&lt;/p&gt;

&lt;p&gt;Instead of feeling like a chatbot producing text, it often feels more like a &lt;strong&gt;problem-solving assistant&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Not perfect of course — but surprisingly reliable.&lt;/p&gt;




&lt;h2&gt;
  
  
  Gemini 3.1 Pro: The Multimodal Knowledge Engine
&lt;/h2&gt;

&lt;p&gt;Gemini feels a bit different.&lt;/p&gt;

&lt;p&gt;Where ChatGPT behaves like a structured thinker, Gemini often feels like a &lt;strong&gt;massive knowledge engine&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;It seems particularly strong when the task involves large amounts of information or mixed input types.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;p&gt;• long documents&lt;br&gt;
• multimodal inputs (text + images + video)&lt;br&gt;
• large context reasoning&lt;br&gt;
• combining information from multiple sources&lt;/p&gt;

&lt;p&gt;Another thing worth mentioning is how deeply it connects to the &lt;strong&gt;Google ecosystem&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Gemini is increasingly integrated across:&lt;/p&gt;

&lt;p&gt;• Google Docs&lt;br&gt;
• Gmail&lt;br&gt;
• Search&lt;br&gt;
• Android&lt;br&gt;
• developer tooling&lt;/p&gt;

&lt;p&gt;Because of that, it sometimes feels less like “a chatbot” and more like &lt;strong&gt;an AI layer sitting on top of Google’s products&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That’s a very different strategy compared to OpenAI.&lt;/p&gt;




&lt;h2&gt;
  
  
  Context Windows: This Part Is Honestly Wild
&lt;/h2&gt;

&lt;p&gt;One of the biggest changes in modern AI models is context size.&lt;/p&gt;

&lt;p&gt;Both GPT-5.4 and Gemini 3.1 Pro can now handle &lt;strong&gt;around 1 million tokens of context&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Which basically means you can throw things like:&lt;/p&gt;

&lt;p&gt;• entire codebases&lt;br&gt;
• long research papers&lt;br&gt;
• full reports&lt;br&gt;
• books&lt;br&gt;
• multi-hour transcripts&lt;/p&gt;

&lt;p&gt;into a single prompt.&lt;/p&gt;

&lt;p&gt;A couple of years ago this would have sounded unrealistic.&lt;/p&gt;

&lt;p&gt;Now it’s becoming fairly normal.&lt;/p&gt;

&lt;p&gt;For things like research, engineering analysis, or enterprise knowledge work, this is actually a &lt;strong&gt;pretty big deal&lt;/strong&gt;.&lt;/p&gt;
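&lt;p&gt;Before pasting a whole codebase or report set into a prompt, it helps to sanity-check the size. Here is a minimal back-of-the-envelope sketch — the ~4-characters-per-token ratio is only a rough heuristic for English text, and real counts depend on each model's tokenizer:&lt;/p&gt;

```python
# Back-of-the-envelope token budgeting for a ~1M-token context window.
# The 4-chars-per-token ratio is a rough heuristic for English text/code;
# actual counts depend on the model's tokenizer.

def estimate_tokens(text):
    """Crude token estimate: roughly 4 characters per token."""
    return len(text) // 4

def fits_in_window(docs, budget=1_000_000):
    """Return (estimated total tokens, whether the budget covers them)."""
    total = sum(estimate_tokens(d) for d in docs)
    return total, budget >= total

# Example: ten ~100 KB files is only about 250k estimated tokens,
# comfortably inside a 1M-token budget.
total, fits = fits_in_window(["x" * 100_000] * 10)
```

&lt;p&gt;For production use you would swap the heuristic for the provider's actual tokenizer, but as a quick "will this even fit?" check, this is usually close enough.&lt;/p&gt;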




&lt;h2&gt;
  
  
  Quick Capability Comparison
&lt;/h2&gt;

&lt;p&gt;Not scientific benchmarks — just practical impressions from using both.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8y87qhgdhtztu2afyw5v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8y87qhgdhtztu2afyw5v.png" alt="chatgpt-vs-gemini-quick-capability-comparison" width="800" height="282"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Both models are strong. They just optimize for slightly different things.&lt;/p&gt;




&lt;h2&gt;
  
  
  My Honest Takeaway
&lt;/h2&gt;

&lt;p&gt;If your work is heavily focused on:&lt;/p&gt;

&lt;p&gt;• engineering&lt;br&gt;
• coding&lt;br&gt;
• technical reasoning&lt;br&gt;
• structured problem solving&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GPT-5.4 currently feels slightly stronger.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;But if your work involves:&lt;/p&gt;

&lt;p&gt;• large documents&lt;br&gt;
• multimodal inputs&lt;br&gt;
• research synthesis&lt;br&gt;
• Google ecosystem workflows&lt;/p&gt;

&lt;p&gt;then &lt;strong&gt;Gemini 3.1 Pro is extremely impressive&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Real Story
&lt;/h2&gt;

&lt;p&gt;The most interesting part of this comparison isn't which model wins.&lt;/p&gt;

&lt;p&gt;The real story is &lt;strong&gt;how fast the lead keeps changing&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Six months ago this comparison looked different.&lt;/p&gt;

&lt;p&gt;Six months from now it will probably look different again.&lt;/p&gt;

&lt;p&gt;The pace of change in AI right now is honestly a bit crazy.&lt;/p&gt;

&lt;p&gt;Which also makes it one of the most fascinating technology shifts to watch.&lt;/p&gt;




&lt;p&gt;If you’ve been using both recently, I’m curious:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Which one actually made you more productive?&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  About the Author
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://akshatuniyal.com" rel="noopener noreferrer"&gt;Akshat Uniyal&lt;/a&gt;&lt;/strong&gt; writes about Artificial Intelligence, engineering systems, and practical technology thinking.&lt;br&gt;
Explore more articles at &lt;a href="https://blog.akshatuniyal.com" rel="noopener noreferrer"&gt;https://blog.akshatuniyal.com&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>chatgpt</category>
      <category>gemini</category>
      <category>ai</category>
      <category>llm</category>
    </item>
    <item>
      <title>Why Smart AI Teams Are Quietly Switching to Small Language Models?</title>
      <dc:creator>Akshat Uniyal</dc:creator>
      <pubDate>Thu, 05 Mar 2026 10:18:32 +0000</pubDate>
      <link>https://forem.com/akshat_uniyal/why-smart-ai-teams-are-quietly-switching-to-small-language-models-4ed7</link>
      <guid>https://forem.com/akshat_uniyal/why-smart-ai-teams-are-quietly-switching-to-small-language-models-4ed7</guid>
      <description>&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://blog.akshatuniyal.com" rel="noopener noreferrer"&gt;https://blog.akshatuniyal.com&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The current AI landscape feels like a &lt;strong&gt;Mad Max scenario&lt;/strong&gt;. Everyone is rushing to onboard the biggest models they can afford - bigger budgets, massive parameter counts, and even bigger expectations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When tested in a sandbox, these giants look incredible:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Demos are impressive&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Responses sound brilliant&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Leadership gets excited&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;And yet, when these models are moved into production, cracks start to appear.&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Costs rise faster than expected&lt;/li&gt;
&lt;li&gt;Hallucinations surface&lt;/li&gt;
&lt;li&gt;Latency becomes a constant complaint&lt;/li&gt;
&lt;li&gt;Responses sound confident… but aren't always correct&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Nothing "broke".&lt;/p&gt;

&lt;p&gt;The model is working exactly as it is designed and trained to do.&lt;/p&gt;

&lt;p&gt;Here's an uncomfortable truth that we keep seeing in production AI:&lt;/p&gt;

&lt;p&gt;Most AI failures aren't caused by models that are too small.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;They are generally caused by models that are too big for the task.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;So what exactly happened? Let's unpack it.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. General Intelligence vs Getting the Job Done
&lt;/h2&gt;

&lt;p&gt;Large Language Models are normally favoured because they are generalists.&lt;/p&gt;

&lt;p&gt;They know a little about everything.&lt;/p&gt;

&lt;p&gt;That's great for exploration but risky for execution.&lt;/p&gt;

&lt;p&gt;In real business workflows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;"mostly correct" is still wrong&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Hallucinations don't show up in demos - but they creep into production later&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Small Language Models take a different approach:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Narrow scope&lt;/li&gt;
&lt;li&gt;Task or Domain specific training&lt;/li&gt;
&lt;li&gt;Built-in guardrails, which means fewer surprises&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;While LLMs are "&lt;strong&gt;jacks of all trades&lt;/strong&gt;", SLMs can be trained on high-value data sets to become "experts" in a specific field.&lt;/p&gt;

&lt;p&gt;In general, most enterprise use cases don't need creativity.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;&lt;em&gt;They just need accuracy that works every single time.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  2. The Hidden Tax Nobody Talks About
&lt;/h2&gt;

&lt;p&gt;In comparison to LLMs, SLMs require fewer computational resources. They train faster and run efficiently on commodity hardware rather than requiring massive H100 clusters.&lt;/p&gt;

&lt;p&gt;LLMs don't just cost more - they behave differently when scaled.&lt;/p&gt;

&lt;p&gt;Something that runs smoothly in a sandbox can quickly become painful in production once:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Usage increases&lt;/li&gt;
&lt;li&gt;Latency hits client-facing flows&lt;/li&gt;
&lt;li&gt;Accounting starts asking difficult questions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;SLMs shine here because they are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cost efficient (cheaper per request)&lt;/li&gt;
&lt;li&gt;Faster to run&lt;/li&gt;
&lt;li&gt;Easy to deploy and scale&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;&lt;em&gt;When AI moves from experiment to architecture, economics start to matter more than capability.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  3. Why Control Matters More Than Raw Intelligence
&lt;/h2&gt;

&lt;p&gt;LLMs are powerful but they are harder to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Control&lt;/li&gt;
&lt;li&gt;Debug&lt;/li&gt;
&lt;li&gt;Predict&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In comparison, SLMs are easier to live with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fine-tuning is practical&lt;/li&gt;
&lt;li&gt;Outputs are more stable&lt;/li&gt;
&lt;li&gt;Evaluation and guardrails actually work&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;&lt;em&gt;Trust in AI doesn't come from intelligence. It comes from predictability.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  4. Production AI Isn't One Big Brain
&lt;/h2&gt;

&lt;p&gt;The most effective AI systems don't rely on a single massive model.&lt;/p&gt;

&lt;p&gt;They're built as a combination of multiple models, each with a clearly defined task.&lt;/p&gt;

&lt;p&gt;SLMs perfectly fit this architecture.&lt;/p&gt;

&lt;p&gt;They can be easily swapped, upgraded and tested without breaking everything else.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;&lt;em&gt;LLMs still have a role - but as an escalation layer, not the default engine.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
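
&lt;p&gt;That escalation pattern can be sketched in a few lines. Everything below is a hypothetical stand-in — &lt;code&gt;call_small_model&lt;/code&gt; and &lt;code&gt;call_large_model&lt;/code&gt; are illustrative stubs, not a real SDK:&lt;/p&gt;

```python
# Hypothetical router: SLM by default, LLM as the escalation layer.
# call_small_model / call_large_model are illustrative stubs, not a real API.

def call_small_model(prompt):
    # Stand-in for a cheap, task-specific SLM call.
    return {"text": "invoice_number=INV-123", "confidence": 0.92}

def call_large_model(prompt):
    # Stand-in for an expensive frontier-LLM call.
    return {"text": "fallback answer", "confidence": 0.99}

def answer(prompt, threshold=0.8):
    """Try the SLM first; escalate only when confidence is below threshold."""
    result = call_small_model(prompt)
    if result["confidence"] >= threshold:
        return result["text"], "slm"
    return call_large_model(prompt)["text"], "llm"
```

&lt;p&gt;In a real system, the escalation check would more likely be schema validation or a verifier step rather than a self-reported confidence score - but the shape stays the same: cheap model first, expensive model only on failure.&lt;/p&gt;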




&lt;h2&gt;
  
  
  Final Thought
&lt;/h2&gt;

&lt;p&gt;So are LLMs bad? &lt;strong&gt;NO!&lt;/strong&gt; The point I want to emphasize here is that we shouldn't keep using them where they don't belong.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;&lt;em&gt;Trying to hammer a nail with a wrench doesn't make the wrench bad - it makes the tool selection wrong.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;High performing teams today aren't asking:&lt;/p&gt;

&lt;p&gt;"What's the powerful model we can use?"&lt;/p&gt;

&lt;p&gt;They are asking instead:&lt;/p&gt;

&lt;p&gt;"What's the smallest model we can use that reliably solves the problem?"&lt;/p&gt;

&lt;p&gt;Because in production:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Predictability beats intelligence&lt;/li&gt;
&lt;li&gt;Systems beat models&lt;/li&gt;
&lt;li&gt;Control beats capability&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Bigger isn't better.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Smaller isn't better either.&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;&lt;em&gt;The right model, for the right job, is better.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;What do you think - will the future of AI belong to massive models, or smarter smaller ones?&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  About the Author
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://akshatuniyal.com" rel="noopener noreferrer"&gt;Akshat Uniyal&lt;/a&gt;&lt;/strong&gt; writes about Artificial Intelligence, engineering systems, and practical technology thinking.&lt;br&gt;
Explore more articles at &lt;a href="https://blog.akshatuniyal.com" rel="noopener noreferrer"&gt;https://blog.akshatuniyal.com&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>machinelearning</category>
      <category>techtalks</category>
    </item>
  </channel>
</rss>
