<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Ankit Dey</title>
    <description>The latest articles on Forem by Ankit Dey (@ankitdey01).</description>
    <link>https://forem.com/ankitdey01</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3814482%2F2740ddf8-90b8-4ea9-854f-4ca9c932f5a1.jpg</url>
      <title>Forem: Ankit Dey</title>
      <link>https://forem.com/ankitdey01</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/ankitdey01"/>
    <language>en</language>
    <item>
      <title>How Did AI Learn to Be Nice? The Humans Behind the Curtain</title>
      <dc:creator>Ankit Dey</dc:creator>
      <pubDate>Wed, 18 Mar 2026 20:48:58 +0000</pubDate>
      <link>https://forem.com/ankitdey01/how-did-ai-learn-to-be-nicethe-humans-behind-the-curtain-352h</link>
      <guid>https://forem.com/ankitdey01/how-did-ai-learn-to-be-nicethe-humans-behind-the-curtain-352h</guid>
      <description>&lt;p&gt;&lt;strong&gt;Welcome back to AI From Scratch.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is Day 8/30 of the Understanding Beginner AI Series&lt;/p&gt;

&lt;p&gt;Where we are:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;Days 1–5&lt;/strong&gt;&lt;/em&gt;: how the brain works — tokens, weights, transformers, attention.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;Day 6&lt;/strong&gt;&lt;/em&gt;: why bigger models often feel smarter (and when that breaks).&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;Day 7&lt;/strong&gt;&lt;/em&gt;: how base models turn into instruction‑tuned assistants that actually listen.&lt;/p&gt;

&lt;p&gt;Today’s question:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;How did these models go from “super smart autocomplete”&lt;br&gt;
to something that tries to be helpful, polite, and safe?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;blockquote&gt;
&lt;p&gt;Short answer: humans got into the training loop.&lt;br&gt;
That upgrade has a name: Reinforcement Learning from Human Feedback (RLHF).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The problem: powerful, but kind of feral
&lt;/h2&gt;

&lt;p&gt;Imagine a pure base model, fresh out of pretraining.&lt;br&gt;
It has read half the internet, can mimic lots of styles, and knows tons of facts — but no one has told it what good behavior looks like.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;So it can:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Spit out toxic stuff (because the internet has plenty).&lt;/li&gt;
&lt;li&gt;Argue with you, overshare, or confidently hallucinate.&lt;/li&gt;
&lt;li&gt;Ignore instructions and just continue text in weird ways.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In other words: raw capability, zero manners.&lt;/p&gt;

&lt;p&gt;Companies realized: if we release that to the public, it will be a PR and safety disaster. They needed a way to bend the model toward being helpful, harmless, and honest — the famous “HHH” alignment goals.&lt;/p&gt;

&lt;p&gt;So what this means for you: the model you chat with today is not the raw brain — it’s the raw brain plus a bunch of extra training to make it behave more like a decent human teammate.&lt;/p&gt;

&lt;h2&gt;
  
  
  RLHF in one line: “Do more of what humans like”
&lt;/h2&gt;

&lt;p&gt;Traditional reinforcement learning is:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;“Take an action, get a reward, update your behavior to get more reward next time.”&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;RLHF just swaps out “game score” or “environment reward” for “what a human preferred.”&lt;/p&gt;

&lt;p&gt;Instead of:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;“You got +10 for reaching the goal in a maze”&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;we use:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;“Humans liked answer A more than answer B, so A gets a higher ‘reward’.”&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Then we train the model to prefer answers humans tend to prefer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;So what this means for you: RLHF is literally teaching the model, “When in doubt, act more like this human‑approved answer, not that one.”&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: start with a capable base model
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;RLHF doesn’t replace pretraining&lt;/em&gt;&lt;/strong&gt;; it rides on top of it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The recipe starts with:
&lt;/h2&gt;

&lt;p&gt;A pretrained base model that already knows &lt;em&gt;language, facts, code, etc&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Often it’s also gone through a supervised fine‑tuning stage with curated “good assistant” examples (we touched on this in Day 7).&lt;/p&gt;

&lt;p&gt;Think of this as:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“We’ve built a super talented intern who knows a ton,&lt;br&gt;
but hasn’t yet been taught company culture or what’s off‑limits.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So what this means for you: RLHF doesn’t make a dumb model smart; it makes a smart model behave better.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: humans rate multiple answers
&lt;/h2&gt;

&lt;p&gt;Now comes the “humans behind the curtain” part.&lt;/p&gt;

&lt;p&gt;For lots of prompts, the model generates several different answers: A, B, C…&lt;br&gt;
Human reviewers then rank these answers from best to worst.&lt;/p&gt;

&lt;p&gt;They judge things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Is it helpful and on‑topic?&lt;/li&gt;
&lt;li&gt;Is it safe, non‑toxic, non‑harassing?&lt;/li&gt;
&lt;li&gt;Is it factually reasonable (as far as they can tell)?&lt;/li&gt;
&lt;li&gt;Is the tone appropriate (not rude, not over‑confident)?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Those rankings feed into a reward model, a separate smaller model trained to predict “how much would a human like this answer?”&lt;/p&gt;

&lt;p&gt;So what this means for you: somewhere in the background, people have literally sat and said “this answer is better than that one” thousands of times, so your AI now has a sense of which directions humans tend to prefer.&lt;/p&gt;
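&lt;p&gt;As a minimal sketch of how those rankings become a training signal, here is a Bradley‑Terry‑style pairwise loss, which is one common choice for reward‑model training; the scores and numbers are made up for illustration:&lt;/p&gt;

```python
import math

def preference_loss(score_chosen, score_rejected):
    # Bradley-Terry style pairwise loss: the loss shrinks as the
    # reward model scores the human-preferred answer higher than
    # the rejected one.
    sigmoid = 1.0 / (1.0 + math.exp(score_rejected - score_chosen))
    return -math.log(sigmoid)

# Example: raters preferred answer A (scored 2.0) over answer B (0.5).
loss = preference_loss(2.0, 0.5)
```

&lt;p&gt;Training drives this loss down across many such pairs, which is exactly what teaches the reward model “answer A beats answer B.”&lt;/p&gt;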

&lt;h2&gt;
  
  
  Step 3: train the model to chase that reward
&lt;/h2&gt;

&lt;p&gt;Once we have a reward model that can score answers, we bring in reinforcement learning:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The main model (the “policy”) tries different styles of answers.&lt;/li&gt;
&lt;li&gt;The reward model scores them: higher for human‑like good behavior, lower for bad ones.&lt;/li&gt;
&lt;li&gt;An RL algorithm (often something like PPO) tweaks the main model’s weights to maximize that score.&lt;/li&gt;
&lt;/ul&gt;
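&lt;p&gt;A toy sketch of the score the policy actually chases, assuming the common setup where a KL‑style penalty keeps it tethered to the base model; the numbers and the beta value here are illustrative, not real training settings:&lt;/p&gt;

```python
def shaped_reward(rm_score, logp_policy, logp_base, beta=0.1):
    # RLHF-style objective: the reward-model score minus a KL-style
    # penalty that punishes drifting too far from the base model.
    kl_estimate = logp_policy - logp_base
    return rm_score - beta * kl_estimate

# Even if the reward model likes an answer, a big gap from the base
# model's log-probability eats into the usable reward.
r = shaped_reward(rm_score=1.0, logp_policy=-2.0, logp_base=-5.0)
```

&lt;p&gt;The penalty term is why RLHF‑tuned models rarely collapse into a narrow “please the grader” dialect: wandering too far from the base model’s distribution costs reward.&lt;/p&gt;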

&lt;h2&gt;
  
  
  Repeat this over and over:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Answers that humans would probably like more become more likely.&lt;/li&gt;
&lt;li&gt;Answers that would get human side‑eye become less likely.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Over time, the model shifts from “raw internet brain” to “politer assistant that tries to avoid landmines.”&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;So what this means for you: the reason your AI often refuses to give dangerous instructions or shifts tone when you get heated is because there’s been a whole extra training phase that told it which behaviors are rewarded and which get smacked down.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What RLHF actually changes in your experience
&lt;/h2&gt;

&lt;p&gt;Compared to a non‑aligned base model, RLHF‑aligned models tend to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Follow your instructions more reliably:&lt;/strong&gt; they treat your prompt as “please do X” instead of “here’s some text, continue it however.”&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Be more cautious around harmful content:&lt;/strong&gt; they push back on prompts about self‑harm, hate, scams, etc., because those answers get hammered in the feedback loop.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Sound more cooperative and less chaotic:&lt;/strong&gt; tone, politeness, disclaimers — all of that is shaped by what human raters rewarded.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Be more “brand safe”:&lt;/strong&gt; enterprises can align models with their own values, policies, and legal requirements.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So what this means for you: when the AI feels “nice,” “responsible,” or “a little too careful,” that’s not an accident; it’s RLHF steering its behavior toward a particular definition of “good.”&lt;/p&gt;

&lt;h2&gt;
  
  
  The limits and trade‑offs of “niceness”
&lt;/h2&gt;

&lt;p&gt;RLHF is powerful, but it’s not magic.&lt;/p&gt;

&lt;p&gt;Some real‑world issues people point out:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Human bias leaks in&lt;/strong&gt;&lt;br&gt;
If your human raters have certain cultural or political biases, those can be baked into what the model sees as “good behavior.”&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Over‑cautiousness&lt;/strong&gt;&lt;br&gt;
In trying to be safe, models sometimes refuse harmless requests or give generic, over‑sanitized answers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reward hacking&lt;/strong&gt;&lt;br&gt;
The model may learn to “sound” safe and thoughtful without actually being more accurate; it optimizes what looks good to raters, not some perfect moral truth.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Alignment ≠ solved ethics&lt;/strong&gt;&lt;br&gt;
RLHF nudges models toward broad goals like “helpful, harmless, honest,” but what those mean in edge cases is still a messy, ongoing debate.&lt;/p&gt;

&lt;p&gt;So what this means for you: “trained with RLHF” doesn’t mean “always right or perfectly ethical.” It means “there was a serious attempt to point this very strong engine in a direction humans generally like better.”&lt;/p&gt;

&lt;h2&gt;
  
  
  Where this leaves us by Day 8
&lt;/h2&gt;

&lt;p&gt;By now, your mental picture could be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Pretraining&lt;/strong&gt; gave the model its raw knowledge and skills.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Instruction tuning&lt;/strong&gt; taught it to treat prompts as commands and follow formats.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;RLHF&lt;/strong&gt; used human preferences to steer it toward being more helpful, polite, and safe.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;So what this means for you: when you chat with an AI today, you’re not just talking to a giant matrix of numbers — you’re talking to something many humans have indirectly shaped through millions of tiny “this response is better than that one” judgments.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Teaser for Day 9 – Why the Way You Talk to AI Changes Everything
&lt;/h2&gt;

&lt;p&gt;Now that you know how we trained the model to be more aligned with human values, there’s another big lever left:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;How you talk to the model at runtime.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;That’s the world of prompting:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;System prompts:&lt;/strong&gt; the hidden “personality script” the model gets before you even type.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Few‑shot prompts:&lt;/strong&gt; giving examples in your message so it learns the pattern you want on the fly.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Chain‑of‑thought:&lt;/strong&gt; nudging the model to think step by step instead of jumping to an answer.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;On Day 9 – “Why the Way You Talk to AI Changes Everything”&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;we’ll treat prompting as an engineering skill, not mysticism, and show how tiny changes in how you ask can completely change what you get back.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;What blew your mind most? Drop a comment!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>coding</category>
      <category>explainlikeimfive</category>
    </item>
    <item>
      <title>How Does AI Go From Dumb to Useful? The Training Upgrade Nobody Explains</title>
      <dc:creator>Ankit Dey</dc:creator>
      <pubDate>Mon, 16 Mar 2026 11:00:58 +0000</pubDate>
      <link>https://forem.com/ankitdey01/how-does-ai-go-from-dumb-to-usefulthe-training-upgrade-nobody-explains-1f2p</link>
      <guid>https://forem.com/ankitdey01/how-does-ai-go-from-dumb-to-usefulthe-training-upgrade-nobody-explains-1f2p</guid>
      <description>&lt;p&gt;Welcome back to &lt;strong&gt;AI From Scratch.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you’ve reached Day 7, you’re not just “AI‑curious” anymore — you’re basically that friend who secretly understands how this stuff works.&lt;/p&gt;

&lt;p&gt;Where we are so far:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You know AI is a next‑token prediction machine (Day 1).&lt;/li&gt;
&lt;li&gt;You’ve seen how it learns via the training loop (Day 2).&lt;/li&gt;
&lt;li&gt;You’ve peeked inside the layers and neurons (Day 3).&lt;/li&gt;
&lt;li&gt;You’ve met Transformers and attention (Day 4).&lt;/li&gt;
&lt;li&gt;You know it doesn’t read words, it reads tokens and numbers (Day 5).&lt;/li&gt;
&lt;li&gt;And yesterday, we talked about why bigger models often feel smarter — and where that idea breaks.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Today’s question:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If two models are built on the same architecture, trained on similar data…&lt;br&gt;
why does one feel like a nerdy research project and the other like a helpful assistant?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;That’s where base models and instruction‑tuned models enter the chat.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Base model: the raw, slightly feral brain
&lt;/h2&gt;

&lt;p&gt;A &lt;strong&gt;base model&lt;/strong&gt; is what you get right after the big original training run on internet‑scale text.&lt;br&gt;
This is the “pure” next‑word prediction machine: it’s learned language patterns, world facts, coding tricks — everything we talked about up to Day 6.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;But here’s the catch: no one has yet sat it down and said,&lt;br&gt;
“Hey, when a human asks you something, please answer like a helpful assistant.”&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;So a base model:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Knows a lot about language and the world.&lt;/li&gt;
&lt;li&gt;Will happily continue almost any text you give it — stories, code, song lyrics, rants.&lt;/li&gt;
&lt;li&gt;But might ignore your instructions, ramble, or respond in weird formats because it’s not trained to treat your prompt as a command — it just sees it as the start of more text to complete.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So what this means for you: a base model is like a very smart person who has read the whole internet but has never been told “answer questions clearly, step by step, and don’t be a chaos gremlin.”&lt;/p&gt;

&lt;h2&gt;
  
  
  Pretraining vs finetuning: the two phases of “raising” an AI
&lt;/h2&gt;

&lt;p&gt;At a high level, modern LLMs go through two big life phases:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pretraining (the huge, expensive phase)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Model reads massive amounts of text.&lt;/li&gt;
&lt;li&gt;Objective: “predict the next token correctly.”&lt;/li&gt;
&lt;li&gt;Result: broad language understanding + world knowledge.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Finetuning (the shorter, targeted phase)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You start from that pretrained base brain.&lt;/li&gt;
&lt;li&gt;You train it more on a smaller, curated dataset for some specific goal: follow instructions, write in a tone, perform a domain task.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Pretraining is like sending the model to the biggest school on Earth.&lt;br&gt;
Finetuning is like giving it a job‑specific bootcamp: “Now you’re a support agent,” or “Now you’re a polite tutor.”&lt;/p&gt;

&lt;p&gt;So what this means for you: almost every helpful AI you touch today started life as a raw base model, then got extra training layers to make it behave like a usable product.&lt;/p&gt;

&lt;h2&gt;
  
  
  Instruction tuning: teaching the model to actually listen
&lt;/h2&gt;

&lt;p&gt;One very specific style of finetuning turned out to be a game‑changer: &lt;strong&gt;instruction tuning.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Instead of just feeding the model random text, we create datasets that look like this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Instruction: “Summarize this article in 3 bullet points.”&lt;br&gt;
Input: (some article)&lt;br&gt;
Output: (a good 3‑bullet summary)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Or:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Instruction: “Explain transformers to a 10‑year‑old.”&lt;br&gt;
Output: (simple, kid‑friendly explanation)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The model is then fine‑tuned on thousands or millions of these (instruction, response) pairs across many tasks — translation, summarization, Q&amp;amp;A, reasoning steps, coding help, etc.&lt;/p&gt;
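&lt;p&gt;As a hedged sketch, here is one common way such (instruction, input, output) triples get flattened into a single training string; the exact markers vary between datasets, so treat the layout below as illustrative:&lt;/p&gt;

```python
def format_example(instruction, input_text, output):
    # One common SFT layout: join the fields with plain-text markers,
    # then train with next-token loss (usually only on the Output part).
    parts = ["Instruction: " + instruction]
    if input_text:
        parts.append("Input: " + input_text)
    parts.append("Output: " + output)
    return "\n".join(parts)

sample = format_example(
    "Summarize this article in 3 bullet points.",
    "(some article)",
    "(a good 3-bullet summary)",
)
```

&lt;p&gt;Seeing millions of strings shaped like this is what teaches the model that “Instruction:” text is a command to act on, not just text to continue.&lt;/p&gt;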

&lt;p&gt;Over time, it learns a meta‑skill:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“When a human writes something that looks like an instruction,&lt;br&gt;
I should respond in the style and format they seem to want.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Compared to a base model, an &lt;strong&gt;instruction‑tuned model:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Follows your prompt structure more closely.&lt;/li&gt;
&lt;li&gt;Stays on task instead of randomly storytelling.&lt;/li&gt;
&lt;li&gt;Is better at “do X in style Y with constraint Z.”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So what this means for you: that feeling of “I can just talk to it like a person and it mostly gets what I mean” is usually instruction tuning doing its job, not magic.&lt;/p&gt;

&lt;h2&gt;
  
  
  Base vs instruction‑tuned: how they feel different
&lt;/h2&gt;

&lt;p&gt;Let’s make this concrete.&lt;/p&gt;

&lt;p&gt;Ask a base model:&lt;/p&gt;

&lt;p&gt;“Explain AI in 5 short bullet points I can paste into a slide.”&lt;/p&gt;

&lt;p&gt;It might:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Write a long essay.&lt;/li&gt;
&lt;li&gt;Ignore the “5 bullets” part.&lt;/li&gt;
&lt;li&gt;Drift into a Wikipedia‑style info dump.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Ask an &lt;strong&gt;instruction‑tuned model&lt;/strong&gt; the same thing and you’re more likely to get:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Exactly 5 bullets.&lt;/li&gt;
&lt;li&gt;Slide‑friendly phrasing.&lt;/li&gt;
&lt;li&gt;A tone that roughly matches your request.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Why? Because during instruction tuning, it has seen thousands of examples where people say “summarize,” “list,” “explain like I’m 12,” “write an email that…”, and it’s been graded on how well it followed those commands.&lt;/p&gt;

&lt;p&gt;So what this means for you: if you want a research sandbox, a base model can be fun. If you want a reliable assistant that listens, instruction‑tuned is usually what you’re actually using — and what you should look for.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where base models still matter
&lt;/h2&gt;

&lt;p&gt;With all this love for instruction‑tuned models, you might ask:&lt;br&gt;
“Why do base models exist at all?”&lt;/p&gt;

&lt;p&gt;A few big reasons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Research and advanced users:&lt;/strong&gt; base models let researchers and companies fine‑tune for their own, very specific needs (legal, medical, internal docs) without fighting against someone else’s chatty assistant persona.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Raw capability vs behavior:&lt;/strong&gt; some work even suggests that on certain reasoning benchmarks or under distribution shifts, base models can outperform their instruction‑tuned cousins, which may overfit to specific prompting styles.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Full control:&lt;/strong&gt; if you want to design your own way of turning a raw model into a product (your own safety rules, tone, tools), starting from the base gives you a clean slate.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So what this means for you: when you see “base” vs “instruct” versions of the same model, base is the raw engine; instruct is the same engine with a “friendly driver” layer on top.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this matters even if you never train a model yourself
&lt;/h2&gt;

&lt;p&gt;You might think: “Cool, but I’m never going to fine‑tune a model. Why should I care about any of this?”&lt;/p&gt;

&lt;p&gt;A few reasons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Choosing tools:&lt;/strong&gt; knowing if an app is built on a base or instruction‑tuned model tells you what to expect: more raw creativity vs more obedient instruction following.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Debugging weird behavior:&lt;/strong&gt; if a model keeps ignoring your prompt structure, you now know: either it’s a weakly tuned base model, or its instruction data didn’t reinforce your style enough.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Understanding the limits:&lt;/strong&gt; even super polished instruction‑tuned models are still just next‑token predictors underneath. Instruction tuning narrows and shapes behavior; it doesn’t turn them into perfect logical robots.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So what this means for you: when an AI feels surprisingly helpful, that’s not just “the model is big” — it’s “the model has been trained again, specifically to behave like a useful assistant.”&lt;/p&gt;

&lt;h2&gt;
  
  
  Teaser for Day 8 – How Did AI Learn to Be Nice?
&lt;/h2&gt;

&lt;p&gt;We’ve now separated two big ideas:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Pretraining:&lt;/strong&gt; where the model learns about language and the world.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Instruction tuning / finetuning:&lt;/strong&gt; where we teach it to follow instructions and act more like an assistant.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But there’s still one more layer we haven’t unpacked:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;How did AI learn to be polite, avoid certain topics, refuse some requests,&lt;br&gt;
and generally behave “aligned” with human values (at least most of the time)?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That’s where Reinforcement Learning from Human Feedback (RLHF) comes in — humans literally rating and steering the model’s behavior, and the model learning which kinds of responses get “rewarded.”&lt;/p&gt;

&lt;p&gt;Tomorrow, in Day 8 – “&lt;strong&gt;How Did AI Learn to Be Nice? The Humans Behind the Curtain”&lt;/strong&gt;&lt;br&gt;
we’ll talk about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What RLHF actually is in everyday language&lt;/li&gt;
&lt;li&gt;How humans sit in the training loop&lt;/li&gt;
&lt;li&gt;And why this “alignment” step matters for safety and for how pleasant your AI chats feel&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;That's the end!&lt;br&gt;
What blew your mind most? Drop a comment!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>coding</category>
    </item>
    <item>
      <title>Why Is a Bigger AI "Smarter"? It's Not What You Think (Day 6/30 Beginner AI Series)</title>
      <dc:creator>Ankit Dey</dc:creator>
      <pubDate>Sun, 15 Mar 2026 08:51:15 +0000</pubDate>
      <link>https://forem.com/ankitdey01/why-is-a-bigger-ai-smarter-its-not-what-you-think-day-630-beginner-ai-series-43o0</link>
      <guid>https://forem.com/ankitdey01/why-is-a-bigger-ai-smarter-its-not-what-you-think-day-630-beginner-ai-series-43o0</guid>
      <description>&lt;p&gt;Welcome back to AI From Scratch.&lt;br&gt;
If you're still here on Day 6, you're officially that friend who "just wanted a simple overview" and then accidentally learned how half the field works.&lt;/p&gt;

&lt;p&gt;Quick rewind:&lt;br&gt;
Day 1: AI as a next‑word prediction machine.&lt;br&gt;
Day 2: How it learns by failing and nudging weights.&lt;br&gt;
Day 3: What's happening inside when it "thinks."&lt;br&gt;
Day 4: Transformers and attention - the wiring that made modern AI possible.&lt;br&gt;
Day 5: AI doesn't read words, it reads tokens and numbers.&lt;/p&gt;

&lt;p&gt;Today's question:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If everyone keeps bragging about "50B parameters" or "1T parameters"…&lt;br&gt;
what does making a model bigger actually change?&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  So, what even is a "parameter" again?
&lt;/h2&gt;

&lt;p&gt;From Days 1 and 2: a &lt;strong&gt;parameter&lt;/strong&gt; is just one tiny knob inside the model, a weight that says "when I see this pattern, react this much."&lt;br&gt;
A model with 1 million parameters is like a brain with 1 million tiny switches.&lt;/p&gt;

&lt;p&gt;A model with 1 trillion parameters is like a brain with a whole galaxy of switches.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;More parameters = more capacity to store patterns from training data:&lt;br&gt;
language quirks, world facts, coding tricks, writing styles, reasoning shortcuts.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;So what this means for you: when people say "bigger model," they literally mean "a brain with way more knobs that can in theory capture way more detail."&lt;/p&gt;
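&lt;p&gt;To make "knobs" concrete, here is a rough sketch of how parameters are counted for a single fully connected layer; the layer sizes are illustrative, not taken from any particular model:&lt;/p&gt;

```python
def dense_layer_params(n_in, n_out):
    # Each of the n_out neurons has n_in incoming weights plus one
    # bias; every one of those numbers is a trainable "knob".
    return n_in * n_out + n_out

# One 4096-to-4096 layer alone holds about 16.8 million parameters;
# stacking many such layers is how counts reach the billions.
p = dense_layer_params(4096, 4096)
```

&lt;p&gt;Real transformers add attention matrices and embeddings on top, but the counting logic is the same multiplication everywhere.&lt;/p&gt;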




&lt;h2&gt;
  
  
  Why making models bigger helped so much
&lt;/h2&gt;

&lt;p&gt;Around 2020, researchers noticed something wild:&lt;br&gt;
if you scale up &lt;strong&gt;model size, data&lt;/strong&gt;, and &lt;strong&gt;compute&lt;/strong&gt; in the right way, performance improves in a pretty smooth, predictable way. These are the famous &lt;strong&gt;scaling laws&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In practice, that meant:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;10× more compute → noticeably lower error on language tasks.&lt;/li&gt;
&lt;li&gt;Bigger models kept getting better, not just a tiny bit, but enough to be worth the extra GPUs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's why we went from models with millions of parameters to ones with billions and then hundreds of billions: the graph kept trending in the right direction.&lt;/p&gt;

&lt;p&gt;So what this means for you: the "era of huge models" wasn't just hype; the data really did show that, for a while, simply scaling up size (plus data and compute) kept unlocking better performance.&lt;/p&gt;




&lt;h2&gt;
  
  
  The spooky part: new abilities just… appear
&lt;/h2&gt;

&lt;p&gt;As people scaled models up, something surprising happened:&lt;br&gt;
&lt;strong&gt;bigger models started doing things they were never explicitly trained to do.&lt;/strong&gt;&lt;br&gt;
Examples researchers noticed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Smaller models: decent at basic text completion.&lt;/li&gt;
&lt;li&gt;Larger ones: suddenly could translate, do few‑shot learning ("here are 3 examples, now do the 4th"), solve simple math, write code, all from the same next‑word training objective.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are often called &lt;strong&gt;emergent abilities&lt;/strong&gt;: skills that seem to "switch on" once you pass a certain size, even though the training recipe didn't change.&lt;/p&gt;

&lt;p&gt;So what this means for you: when GPT‑3 felt &lt;em&gt;qualitatively&lt;/em&gt; different from GPT‑2, it wasn't because someone manually added "write emails" mode - it was a side‑effect of pushing model size, data, and compute past a certain threshold.&lt;/p&gt;




&lt;h2&gt;
  
  
  But it's not "just make it huge" - data matters a lot
&lt;/h2&gt;

&lt;p&gt;Then another twist: &lt;strong&gt;bigger isn't always better if you don't feed it enough data.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;DeepMind's &lt;strong&gt;Chinchilla&lt;/strong&gt; work showed that GPT‑3‑style models were actually &lt;em&gt;under‑trained&lt;/em&gt; on data for their size.&lt;/p&gt;

&lt;p&gt;They trained a &lt;em&gt;smaller&lt;/em&gt; model (around 70B parameters) on more tokens than previous giants, and it beat much larger models that had seen less data.&lt;br&gt;
Roughly speaking, they found: for a fixed compute budget, you should grow &lt;strong&gt;model size and dataset size together, instead of only cranking up parameters&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;So what this means for you: a 1T‑parameter model trained on too little or low‑quality data can be dumber than a well‑trained 70B model. Size gives capacity; data and training actually fill it with something useful.&lt;/p&gt;
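&lt;p&gt;The Chinchilla finding is often compressed into a rule of thumb of roughly 20 training tokens per parameter; a tiny illustrative sketch (the exact ratio depends on the compute budget, so treat 20 as a ballpark):&lt;/p&gt;

```python
def chinchilla_tokens(n_params):
    # Rule of thumb distilled from the Chinchilla paper: roughly
    # 20 training tokens per parameter for compute-optimal training.
    return 20 * n_params

# A 70B-parameter model would want on the order of 1.4 trillion tokens.
tokens = chinchilla_tokens(70e9)
```

&lt;p&gt;By this yardstick, a 175B-parameter model trained on only 300B tokens is badly data-starved, which is exactly the gap Chinchilla exploited.&lt;/p&gt;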




&lt;h2&gt;
  
  
  Small vs large models in the real world
&lt;/h2&gt;

&lt;p&gt;Outside research papers, teams now run into a very practical question:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Do we really need the giant model for this job?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Rough pattern people see:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Large models (10B–70B+ parameters):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Better at complex reasoning, multi‑step tasks, and understanding long context.&lt;/li&gt;
&lt;li&gt;Often lower hallucination rates on factual queries (though still not perfect).&lt;/li&gt;
&lt;li&gt;Heavier: more GPUs, more energy, more latency and cost.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Small models (&amp;lt;1B–a few B parameters):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fast, cheap, can sometimes run on a laptop or phone.&lt;/li&gt;
&lt;li&gt;Great when you fine‑tune them for a very specific domain.&lt;/li&gt;
&lt;li&gt;Weaker at open‑ended reasoning and multi‑language tasks, but easier to deploy privately.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So what this means for you: bigger models tend to feel "smarter" on broad, messy tasks, but for focused, everyday jobs (like one company's support emails), a smaller, tuned model can actually be the better call.&lt;/p&gt;




&lt;h2&gt;
  
  
  What does scaling really buy you?
&lt;/h2&gt;

&lt;p&gt;If we strip away the marketing, going from 1M to 1B to 1T parameters mainly buys you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;More capacity:&lt;/strong&gt; the model can store and express richer patterns about language, code, and the world, especially when paired with enough training data.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Better generalization:&lt;/strong&gt; it handles weirder prompts, rare edge cases, and "I've never seen this exact thing, but I can reason it out from patterns I have seen."&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Longer, more coherent chains of thought:&lt;/strong&gt; with larger models and bigger context windows, you can give longer instructions and documents and still get reasonable, on‑topic answers.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;New capabilities at certain sizes:&lt;/strong&gt; translation, coding help, chain‑of‑thought reasoning, few‑shot learning all start to show up more clearly the bigger you go.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But in exchange, you pay in &lt;strong&gt;compute, latency, energy, and money&lt;/strong&gt;, which is why there's now a whole movement around "small but smart enough" models.&lt;/p&gt;

&lt;p&gt;So what this means for you: "Is bigger smarter?" is the wrong question. The better question is: "For this job, is the extra capability from a larger model worth the extra cost and complexity?"&lt;/p&gt;




&lt;h2&gt;
  
  
  Zooming out: where we are by Day 6
&lt;/h2&gt;

&lt;p&gt;Let's connect the dots from the whole series so far:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The model predicts the next token using weights (parameters).&lt;/li&gt;
&lt;li&gt;Those weights were learned through the training loop.&lt;/li&gt;
&lt;li&gt;Inside, transformers and attention structure the thinking process.&lt;/li&gt;
&lt;li&gt;Input text becomes tokens and embeddings inside a fixed context window.&lt;/li&gt;
&lt;li&gt;Scaling up size + data + compute follows surprisingly smooth laws… until data or money runs out.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So what this means for you: when you hear "this new model is 4× bigger," you now know that really means "its brain has more room for patterns, but whether that translates to real gains depends on data, training, and what you're using it for."&lt;/p&gt;




&lt;h2&gt;
  
  
  Teaser for Day 7, The Training Upgrade Nobody Talks About
&lt;/h2&gt;

&lt;p&gt;Today we stayed at the "raw capability" level: how big the brain is, and how much that matters.&lt;/p&gt;

&lt;p&gt;But there's another twist coming:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Why does the same model feel dumb in one setting and super helpful in another?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;On Day 7 - "How Does AI Go From Dumb to Useful? The Training Upgrade that matters"&lt;br&gt;
we'll get into:&lt;br&gt;
**Base models vs instruction‑tuned models&lt;br&gt;
**What "RL from human feedback" actually changes in behavior&lt;br&gt;
Why some AIs feel like they're arguing with you, and others feel like a polite sugarcoating assistant&lt;/p&gt;

&lt;p&gt;In other words: you've seen how we build the brain and how big it can get.&lt;br&gt;
Next, we'll talk about how we teach that brain to talk to humans in a way that's actually useful to you.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;What blew your mind most? Drop a comment!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>coding</category>
      <category>explainlikeimfive</category>
    </item>
    <item>
      <title>AI Doesn't actually Read Words. Here's What It Reads (Day - 5/30 Beginner AI Series)</title>
      <dc:creator>Ankit Dey</dc:creator>
      <pubDate>Sat, 14 Mar 2026 16:05:54 +0000</pubDate>
      <link>https://forem.com/ankitdey01/ai-doesnt-actually-read-words-heres-what-it-reads-day-530-beginner-ai-series-4ec1</link>
      <guid>https://forem.com/ankitdey01/ai-doesnt-actually-read-words-heres-what-it-reads-day-530-beginner-ai-series-4ec1</guid>
      <description>&lt;p&gt;Welcome back to &lt;strong&gt;AI From Scratch.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you've made it to Day 5, you've already done more deep‑learning theory than most people who tweet about AI.&lt;br&gt;
Quick rewind:&lt;br&gt;
&lt;a href="https://dev.to/ankitdey01/how-does-you-ai-know-so-much-with-such-less-size-37pg"&gt;Day 1&lt;/a&gt;: AI is a next‑word prediction machine with a ton of weights.&lt;br&gt;
&lt;a href="https://dev.to/ankitdey01/day-2-beginner-ai-series-how-ai-actually-learns-the-training-story-nobody-tells-you-2hi4"&gt;Day 2&lt;/a&gt;: Those weights are trained like a kid practicing free throws.&lt;br&gt;
&lt;a href="https://dev.to/ankitdey01/whats-really-happening-inside-when-it-thinks-day-330-beginner-ai-series-20b4"&gt;Day 3&lt;/a&gt;: Inside, layers and neurons act like an assembly line of tiny reactions.&lt;br&gt;
&lt;a href="https://dev.to/ankitdey01/attention-is-all-you-need-that-one-idea-that-made-modern-ai-possible-from-2017-day-430--2kmn"&gt;Day 4&lt;/a&gt;: Transformers + attention let each word decide which other words to care about.&lt;/p&gt;

&lt;p&gt;Today's question is sneakier:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If AI doesn't actually "see" words the way we do… what does it see?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Because when you paste some long rant into a chatbot, it's not reading sentences. It's reading something more primitive: tokens and numbers.&lt;/p&gt;




&lt;h3&gt;
  
  
  So what is the AI actually looking at?
&lt;/h3&gt;

&lt;p&gt;When you type:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Explain AI to me like I'm sleep‑deprived but curious."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The model doesn't see that as one clean string.&lt;br&gt;
Step one is &lt;strong&gt;tokenization&lt;/strong&gt;, chopping your text into little units called &lt;strong&gt;tokens.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Tokens are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sometimes full words (Apple, hello).&lt;/li&gt;
&lt;li&gt;Sometimes pieces of words (un, believ, able).&lt;/li&gt;
&lt;li&gt;Sometimes punctuation, spaces, even emojis.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each of those tokens gets turned into an &lt;em&gt;ID&lt;/em&gt; (a number like 42 or 18,307) from the model's vocabulary.&lt;br&gt;
So what this means for you: when you talk to an AI, it's not thinking in "words and sentences" - it's thinking in IDs representing tiny chunks of text.&lt;/p&gt;
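&lt;p&gt;Here's a toy sketch of that step in Python. The vocabulary and IDs below are completely made up (real tokenizers learn vocabularies of tens of thousands of pieces from data), but the "longest known chunk wins" flavor is similar:&lt;/p&gt;

```python
# Toy tokenizer: greedy longest-match against a tiny, invented vocabulary.
# Real tokenizers (BPE and friends) learn their vocab from data; these IDs are made up.
VOCAB = {"un": 1, "believ": 2, "able": 3, "hello": 4, " ": 5, "!": 6}

def tokenize(text):
    """Split text into (token, id) pairs by always taking the longest known chunk."""
    tokens = []
    i = 0
    while i != len(text):
        # try the longest possible chunk first, then shorter ones
        for j in range(len(text), i, -1):
            chunk = text[i:j]
            if chunk in VOCAB:
                tokens.append((chunk, VOCAB[chunk]))
                i = j
                break
        else:
            raise ValueError("no token found for: " + text[i:])
    return tokens

print(tokenize("unbelievable"))  # [('un', 1), ('believ', 2), ('able', 3)]
```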




&lt;h3&gt;
  
  
  Chopping sentences into Lego pieces
&lt;/h3&gt;

&lt;p&gt;Why bother with this token stuff? Why not just use whole words?&lt;/p&gt;

&lt;p&gt;Because language is messy:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;New slang shows up daily.&lt;/li&gt;
&lt;li&gt;Names, hashtags, typos, random URLs…&lt;/li&gt;
&lt;li&gt;Some languages don't really use spaces.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the model only understood full words, it would be completely lost the moment you typed something it hadn't seen before.&lt;br&gt;
Enter &lt;strong&gt;subword tokenization&lt;/strong&gt;: methods like BPE (Byte Pair Encoding) that learn common pieces of words and reuse them.&lt;/p&gt;

&lt;p&gt;Think of it as Lego bricks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Common words get their own big brick (computer, football).&lt;/li&gt;
&lt;li&gt;Rare or weird words are built by snapping smaller bricks together (computational, micro‑SaaS).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So what this means for you: the reason AI can handle made‑up words, weird usernames, and half‑English‑half‑Hindi chaos is because it's secretly breaking everything into reusable Lego‑like pieces.&lt;/p&gt;
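&lt;p&gt;The core of BPE is surprisingly small: count which pair of neighboring pieces shows up most often, then merge that pair into a bigger brick, and repeat. A minimal sketch of the counting step, with a made‑up mini corpus:&lt;/p&gt;

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs; BPE merges the winner into one new brick."""
    pairs = Counter()
    for word, freq in words.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0]

# Words as space-separated symbols, with made-up frequencies.
corpus = {"l o w": 5, "l o w e r": 2, "n e w e s t": 6}
print(most_frequent_pair(corpus))  # (('w', 'e'), 8), so "w e" would merge into "we"
```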




&lt;h3&gt;
  
  
  From token IDs to "meaning space"
&lt;/h3&gt;

&lt;p&gt;Okay, so now we have a sequence of token IDs: 154, 892, 77, 301…&lt;br&gt;
The model still can't do anything interesting with just those IDs. They're like jersey numbers with no skills attached.&lt;/p&gt;

&lt;p&gt;Next step: &lt;strong&gt;embeddings.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;An embedding is a big list of numbers that acts like coordinates in a strange "meaning space."&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tokens with related meanings end up near each other.&lt;/li&gt;
&lt;li&gt;Tokens with very different roles drift far apart.&lt;/li&gt;
&lt;li&gt;Certain directions in this space line up with concepts like gender, tense, even "royalty."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is where that classic example comes from:&lt;/p&gt;

&lt;p&gt;_1. "king" and "queen" are close,&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;"king" and "banana" are very far,&lt;/li&gt;
&lt;li&gt;"king" and "man" are related in a different way than "king" and "queen."_&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;You don't need the math. &lt;strong&gt;Just hold this picture: every token becomes a dot in a high‑dimensional map where closeness roughly means "similar vibe or role."&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;So what this means for you: before any attention, layers, or predictions, your words have already been turned into points in a meaning map. The rest of the model is just nudging and combining those points.&lt;/p&gt;
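&lt;p&gt;"Closeness" in that map is usually measured with cosine similarity. Here's a tiny illustration with made‑up 3‑number embeddings (real embeddings have hundreds or thousands of dimensions, and the numbers are learned, not hand‑picked):&lt;/p&gt;

```python
import math

def cosine(u, v):
    """Closeness in meaning space: near 1.0 means same direction, near 0 unrelated."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Made-up 3-number embeddings purely for illustration.
emb = {
    "king":   [0.90, 0.80, 0.10],
    "queen":  [0.85, 0.82, 0.15],
    "banana": [0.05, 0.10, 0.95],
}
print(round(cosine(emb["king"], emb["queen"]), 2))   # very close to 1
print(round(cosine(emb["king"], emb["banana"]), 2))  # much smaller
```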




&lt;h3&gt;
  
  
  Order matters: why "AI loves you" ≠ "you love AI"
&lt;/h3&gt;

&lt;p&gt;One more problem: embeddings capture what each token is, but not where it appears.&lt;br&gt;
"Cat bites dog" and "dog bites cat" have the same words - different meaning.&lt;br&gt;
Transformers fix this by adding some notion of &lt;strong&gt;position&lt;/strong&gt; on top of the token embeddings, so the model knows "this is the first word, this is the second, …".&lt;/p&gt;

&lt;p&gt;You can think of it like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each token gets its meaning coordinates.&lt;/li&gt;
&lt;li&gt;Then it also gets a little tag that says "I'm the 5th token in this sentence."&lt;/li&gt;
&lt;li&gt;The model blends those together so it knows both "what" and "where."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So what this means for you: the model doesn't just bag up your words and shake them; it knows the order they came in, which is why it can tell who did what to whom.&lt;/p&gt;
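&lt;p&gt;A toy sketch of that "what + where" blend, with made‑up 2‑number vectors (real models use learned or sinusoidal position encodings, nothing like these tiny tags):&lt;/p&gt;

```python
# Blend "what" (token embedding) with "where" (position vector). All numbers invented.
emb = {"cat": [1.0, 0.0], "bites": [0.0, 1.0], "dog": [1.0, 1.0]}
pos = [[0.1, 0.0], [0.0, 0.1], [0.1, 0.1]]  # one small tag per position

def encode(sentence):
    """Each token's input = its meaning coordinates + its position tag."""
    return [[e + p for e, p in zip(emb[tok], pos[i])]
            for i, tok in enumerate(sentence.split())]

# Same words, different order → different inputs to the model.
print(encode("cat bites dog"))
print(encode("dog bites cat"))
```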




&lt;h3&gt;
  
  
  The context window: your AI's short‑term memory
&lt;/h3&gt;

&lt;p&gt;Now for the sneaky bit that secretly controls a lot of your experience:&lt;br&gt;
the &lt;strong&gt;context window.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Every model has a maximum number of tokens it can handle in one go - that includes &lt;strong&gt;&lt;em&gt;your prompt + the model's reply.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Roughly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;1 token ≈ ¾ of an English word.&lt;/li&gt;
&lt;li&gt;A few thousand tokens ≈ a few pages of text, depending on the model.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If your conversation plus its answers go beyond that limit, the model starts &lt;strong&gt;forgetting the oldest tokens&lt;/strong&gt;; they literally fall out of its working memory.&lt;br&gt;
So what this means for you: when a chatbot suddenly "forgets" something you said 40 messages ago, it's not being rude; that part of the conversation may have been pushed out of its token budget.&lt;/p&gt;
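&lt;p&gt;You can picture the sliding window as nothing fancier than "keep the newest N tokens." A toy sketch, with invented numbers:&lt;/p&gt;

```python
def sliding_window(tokens, limit):
    """Keep only the newest tokens; the oldest fall out of the budget."""
    return tokens[-limit:]

chat = ["hi", "my", "name", "is", "ankit", "now", "what", "is", "my", "name"]
print(sliding_window(chat, 4))  # ['what', 'is', 'my', 'name'] - the rest is gone

# The prompt and the reply share the same budget (made-up sizes):
limit = 4096
prompt_tokens = 3000
room_for_answer = limit - prompt_tokens
print(room_for_answer)  # 1096
```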




&lt;h3&gt;
  
  
  How token limits shape your chats
&lt;/h3&gt;

&lt;p&gt;Because everything is measured in tokens, a few non‑obvious things happen:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Long prompts eat into the memory budget fast: big copy‑pasted docs, code, or transcripts can leave less room for the model's answer.&lt;/li&gt;
&lt;li&gt;Both input and output count: if you ask for a huge essay, there's less room left for past context.&lt;/li&gt;
&lt;li&gt;Different models have different context windows: newer ones can handle way more tokens than older ones, which is why some feel better at long, multi‑step tasks.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In practice, this is why people talk about "prompt engineering" and "chunking" documents: you're really just managing what fits into that sliding window of tokens the model can see at once.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;So what this means for you: how you feed information to the model (shorter, focused chunks vs giant walls of text) directly affects how coherent and on‑track its answers feel.&lt;/em&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Putting it all together: what your AI actually reads
&lt;/h3&gt;

&lt;p&gt;Let's stitch the whole path from your keyboard to the model's "brain":&lt;br&gt;
You type a message.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The tokenizer slices it into tokens (little text chunks).&lt;/li&gt;
&lt;li&gt;Each token becomes an ID from the model's vocabulary.&lt;/li&gt;
&lt;li&gt;Each ID becomes an embedding , a point in meaning space.&lt;/li&gt;
&lt;li&gt;Positional info gets mixed in so order isn't lost.&lt;/li&gt;
&lt;li&gt;All of this fits into a context window, a fixed‑size memory slot measured in tokens.&lt;/li&gt;
&lt;li&gt;Inside that window, transformers + attention from Day 4 do their thing and predict the next token.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;So what this means for you: your AI is never "reading paragraphs" the way you see them. It's working with a long row of numeric Lego bricks inside a fixed‑size tray, and all the magic you see is built on top of that.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  What's coming on Day 6
&lt;/h3&gt;

&lt;p&gt;Now you know:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How text becomes tokens,&lt;/li&gt;
&lt;li&gt;How tokens become numbers in a meaning space, and&lt;/li&gt;
&lt;li&gt;How the context window limits what the AI can remember at once.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That sets up a very natural next question:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Why is a bigger AI "smarter"? (And where does that idea break down?)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;On Day 6 - "Why Is a Bigger AI Smarter? (It's Not What You Think)"&lt;br&gt;
we'll talk about:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;From 1M to 1T parameters - what scaling actually buys you.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;We'll look at what happens when you crank up parameter counts, why "just make it bigger" sometimes works weirdly well, and why size alone still doesn't guarantee good judgment.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;What blew your mind most? Drop a comment!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>explainlikeimfive</category>
      <category>coding</category>
    </item>
    <item>
      <title>“Attention Is All You Need,” That One Idea That Made Modern AI Possible from 2017 (Day 4/30 - Beginner AI Series)</title>
      <dc:creator>Ankit Dey</dc:creator>
      <pubDate>Fri, 13 Mar 2026 12:27:06 +0000</pubDate>
      <link>https://forem.com/ankitdey01/attention-is-all-you-need-that-one-idea-that-made-modern-ai-possible-from-2017-day-430--2kmn</link>
      <guid>https://forem.com/ankitdey01/attention-is-all-you-need-that-one-idea-that-made-modern-ai-possible-from-2017-day-430--2kmn</guid>
      <description>&lt;p&gt;Welcome back to &lt;strong&gt;AI From Scratch Series&lt;/strong&gt;.&lt;br&gt;
If you've made it to Day 4, congrats.&lt;/p&gt;

&lt;p&gt;Quick recap so far:&lt;br&gt;
&lt;strong&gt;&lt;em&gt;&lt;a href="https://medium.com/@ankitdey450/how-does-you-ai-know-so-much-with-such-less-size-dd91977429fd/" rel="noopener noreferrer"&gt;Day 1&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt; We met the basic trick - AI stores knowledge as weights and predicts the next word.&lt;br&gt;
&lt;strong&gt;&lt;em&gt;&lt;a href="https://medium.com/@ankitdey450/day-2-beginner-ai-series-how-ai-actually-learns-the-training-story-nobody-tells-you-248cf61f32a7/" rel="noopener noreferrer"&gt;Day 2&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt; We watched it train like a kid practicing basketball: guess, get feedback, adjust, repeat.&lt;br&gt;
&lt;strong&gt;&lt;em&gt;&lt;a href="https://medium.com/@ankitdey450/whats-really-happening-inside-ai-when-it-thinks-day-3-30-beginner-ai-series-e5f56fc1cf5d/" rel="noopener noreferrer"&gt;Day 3&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt; We walked through what happens inside when it "thinks" - layers, neurons, little light bulbs firing.&lt;/p&gt;

&lt;p&gt;Today is about the plot twist that took all of that and made it actually work at the scale of ChatGPT, Gemini, and Claude:&lt;br&gt;
an idea called &lt;strong&gt;attention&lt;/strong&gt;, wrapped in an architecture called the &lt;strong&gt;Transformer&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Before attention: AI that forgot the start of the sentence
&lt;/h3&gt;

&lt;p&gt;Before 2017, language models mostly used RNNs and LSTMs: fancy ways of reading text one word at a time, left to right.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Imagine trying to understand a long WhatsApp message where you can only remember the last few words clearly, and everything before that is a blur. That was old‑school AI.&lt;br&gt;
By the time it reached the end of a long sentence, the beginning was basically fuzzy.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;These models struggled with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Long sentences lost context ("At the party yesterday, the friend of my sister who moved to Canada…" and by the end they'd lose track).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Slow, sequential training (they had to read word by word, so no big speed‑ups from parallel hardware).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So what this means for you: early models could do some language tasks, but they hit a ceiling on how coherent and knowledgeable they could feel in long conversations.&lt;/p&gt;

&lt;h3&gt;
  
  
  The 2017 "Attention Is All You Need" moment
&lt;/h3&gt;

&lt;p&gt;In 2017, a group of Google researchers dropped a paper literally titled &lt;strong&gt;"Attention Is All You Need."&lt;/strong&gt;&lt;br&gt;
 Their move was kind of savage: "Let's throw away the old word‑by‑word reading style and build something that just… looks at everything at once and decides what to care about."&lt;/p&gt;

&lt;p&gt;This new design was called the &lt;strong&gt;&lt;a href="https://www.geeksforgeeks.org/machine-learning/getting-started-with-transformers/" rel="noopener noreferrer"&gt;Transformer&lt;/a&gt;&lt;/strong&gt;.&lt;br&gt;
Instead of marching through the sentence in order, it looks at the whole sentence at once and, using an attention mechanism, decides which words matter for each position and what each word means from its surrounding context.&lt;/p&gt;

&lt;p&gt;So what this means for you: that one design shift is why modern chatbots you use daily can keep track of long prompts, instructions, and context in a way older models simply couldn't.&lt;/p&gt;

&lt;h3&gt;
  
  
  Attention, in plain language: who should I care about right now?
&lt;/h3&gt;

&lt;p&gt;Let's say the sentence is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"The book that the boy who wore a red hoodie was reading was fascinating."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;For the word "&lt;strong&gt;was&lt;/strong&gt;" at the end, you don't care about "hoodie", you care about "book."&lt;br&gt;
Your brain instantly jumps back and hooks "was" to "book," not "boy" or "hoodie."&lt;br&gt;
&lt;strong&gt;Attention&lt;/strong&gt; is the model doing the same thing:&lt;br&gt;
for each word, it asks, "Which other words in this sentence are actually relevant to me?" and then focuses more on those.&lt;br&gt;
You can think of it like a highlighter pen that moves around the sentence for every word:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;When processing "was," the highlighter glows strongly on "book."&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;When processing "red," it glows on "hoodie."&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;When processing "boy," it might glow on "who" and "hoodie."&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;So what this means for you: instead of treating all words equally, your AI constantly re‑weights the sentence, pulling the most relevant parts into focus for each piece of the answer.&lt;/em&gt;&lt;/p&gt;
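&lt;p&gt;Under the hood, that "highlighter" is usually a softmax over relevance scores: big scores grab most of the glow, small scores fade out. A toy sketch with invented scores for the word "was":&lt;/p&gt;

```python
import math

def attention_weights(scores):
    """Softmax: turn raw relevance scores into weights that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up relevance of each earlier word to the word "was":
words = ["book", "boy", "hoodie"]
scores = [4.0, 1.0, 0.5]
weights = attention_weights(scores)
for word, w in zip(words, weights):
    print(word, round(w, 2))
# "book" ends up with most of the highlighter's glow
```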

&lt;h3&gt;
  
  
  Self‑attention: the group chat in the model's head
&lt;/h3&gt;

&lt;p&gt;More specifically, self‑attention means every word in the sentence can "talk" to every other word and decide how much it should matter.&lt;/p&gt;

&lt;p&gt;Picture a group discussion:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Each person (word) is allowed to look around the room and think, "Whose opinion matters most for what I'm about to say?"&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;For this moment, maybe you care most about what the data guy said. Next moment, you care more about the designer.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In the model:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Every word creates tiny internal signals that say "here's who I might care about" and "here's what I mean."&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The attention mechanism turns that into weights: basically, "Look 60% at this word, 30% at that one, 10% at those others."&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Then it blends information accordingly.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So what this means for you: when the model answers, it's not reading your message in a straight line. It's constantly cross‑referencing parts of your text with each other, like a very fast group chat where everyone can instantly consult everyone else rather than one at a time.&lt;/p&gt;
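&lt;p&gt;That "look 60% here, 30% there, 10% there, then blend" step is literally a weighted average of the other words' vectors. A minimal sketch with made‑up numbers:&lt;/p&gt;

```python
def blend(weights, values):
    """Mix information: a weighted average of the other words' vectors."""
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

weights = [0.6, 0.3, 0.1]                      # "look 60% here, 30% there, 10% there"
values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # made-up per-word vectors
print([round(v, 2) for v in blend(weights, values)])  # [0.7, 0.4]
```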

&lt;h3&gt;
  
  
  Multi‑head attention: many spotlights at once
&lt;/h3&gt;

&lt;p&gt;One attention pattern is nice, but language is messy.&lt;br&gt;
Sometimes you care about grammar (who did what), sometimes about tone, sometimes about time, sometimes about location.&lt;br&gt;
Transformers handle this with &lt;strong&gt;multi‑head attention.&lt;/strong&gt;&lt;br&gt;
Instead of one big spotlight, they use many smaller ones:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;One head might focus on subject–verb relationships.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Another might track pronouns ("he", "she", "they").&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Another might watch for time phrases ("yesterday", "next year").&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All these heads look at the sentence in parallel, each with its own "perspective."&lt;br&gt;
Then the model mixes their insights together.&lt;br&gt;
So what this means for you: that feeling of "wow, it kept track of who I was talking about and the timeline and the tone" comes from multiple attention heads focusing on different aspects of your message at the same time.&lt;/p&gt;
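&lt;p&gt;A toy sketch of the "many spotlights" idea: each head scores the same sentence its own way, and the model then mixes the heads' views. All the scores and head names below are invented for illustration:&lt;/p&gt;

```python
import math

def softmax(scores):
    """Turn raw relevance scores into weights that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Each "head" scores the same three words from its own perspective (made-up numbers).
heads = {
    "grammar":  [3.0, 0.5, 0.2],   # cares about subject-verb links
    "pronouns": [0.2, 3.0, 0.5],   # cares about who pronouns refer to
    "time":     [0.5, 0.2, 3.0],   # cares about time phrases
}
per_head = {name: softmax(scores) for name, scores in heads.items()}
# The model then mixes every head's view into one combined representation.
mixed = [sum(per_head[name][i] for name in heads) / len(heads) for i in range(3)]
print({name: [round(w, 2) for w in ws] for name, ws in per_head.items()})
print([round(m, 2) for m in mixed])
```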

&lt;h3&gt;
  
  
  Why this unlocked giant, smart-feeling models
&lt;/h3&gt;

&lt;p&gt;Two big reasons transformers changed the game:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;They handle long context well&lt;br&gt;
 Because every word can talk to every other word directly, it's much easier for the model to connect "this thing you said 20 tokens ago" to "this word I'm choosing now."&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;They run fast on modern hardware&lt;br&gt;
 Old RNNs had to read word by word. Transformers can process all tokens in a sentence in parallel, which fits perfectly with GPUs and large clusters.&lt;br&gt;
 That parallelism is what made it realistic to train models with billions of parameters on huge text datasets.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So what this means for you: the reason you have chatbots that can write essays, translate, summarize papers, and code is not just "more data" or &lt;em&gt;"bigger models"&lt;/em&gt;, it's that attention + transformers made training big models actually practical.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bringing it back to your mental picture
&lt;/h3&gt;

&lt;p&gt;Let's merge this with your understanding from Days 1–3:&lt;br&gt;
&lt;strong&gt;Day 1&lt;/strong&gt;: AI is a next‑word prediction machine with lots of weights.&lt;br&gt;
&lt;strong&gt;Day 2&lt;/strong&gt;: Those weights were learned through endless cycles of "guess → compare → adjust."&lt;br&gt;
&lt;strong&gt;Day 3&lt;/strong&gt;: Inside, your text flows through layers and neurons like an assembly line of tiny reactions.&lt;br&gt;
&lt;strong&gt;Day 4&lt;/strong&gt; (today): In those layers, transformers use attention so each word can see the whole sentence and decide what to care about before making its contribution.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;If I had to summarise this whole blog - the magic of modern AI isn't some mysterious soul hiding in the model. It's a very disciplined system that reads everything at once, focuses on the right bits using attention, and then runs its familiar next‑word prediction game on top of that.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  What's coming on Day 5
&lt;/h3&gt;

&lt;p&gt;Now that you've got:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;How AI stores knowledge (weights),&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;How it learns (training loop),&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;How it thinks (layers and neurons), and&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;How it uses attention to predict the next word more efficiently without losing context (transformers and attention),&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;…we're ready for the next natural question:&lt;br&gt;
&lt;em&gt;&lt;strong&gt;"AI Doesn't Read Words. Here's What It Actually Reads."&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;On Day 5, we'll see how your text is chopped into tokens and turned into numbers the model can understand, and why things like tokenization and "context window" secretly control how much your AI can remember from your prompt and how coherent its answer can be.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;What blew your mind most? Drop a comment!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>explainlikeimfive</category>
      <category>nlp</category>
    </item>
    <item>
      <title>What’s Really Happening Inside AI When It “Thinks”? (Day 3/30 - Beginner AI Series)</title>
      <dc:creator>Ankit Dey</dc:creator>
      <pubDate>Thu, 12 Mar 2026 09:00:06 +0000</pubDate>
      <link>https://forem.com/ankitdey01/whats-really-happening-inside-when-it-thinks-day-330-beginner-ai-series-20b4</link>
      <guid>https://forem.com/ankitdey01/whats-really-happening-inside-when-it-thinks-day-330-beginner-ai-series-20b4</guid>
      <description>&lt;h3&gt;
  
  
  Welcome back to Day 3 of AI From Scratch.
&lt;/h3&gt;

&lt;p&gt;So far, we’ve basically met the brain and watched it train.&lt;/p&gt;

&lt;p&gt;On &lt;a href="https://dev.to/ankitdey01/how-does-you-ai-know-so-much-with-such-less-size-37pg"&gt;Day 1&lt;/a&gt;, we saw how AI stores “knowledge” as weights and uses them to predict the next word in a sentence.&lt;br&gt;
On &lt;a href="https://dev.to/ankitdey01/day-2-beginner-ai-series-how-ai-actually-learns-the-training-story-nobody-tells-you-2hi4"&gt;Day 2&lt;/a&gt;, we followed the training story, like a kid practicing basketball: try a shot, see how wrong it is, adjust the form, repeat a million times.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Today we’re asking a new question&lt;/em&gt;&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;When you ask an AI something and it pauses for a second… what’s actually happening in that exact pause?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI’s answer is just a chain of word predictions.&lt;/strong&gt;&lt;br&gt;
That “thinking” moment is just your question flowing through layers of neurons, triggering little reactions, and ending in a chain of word predictions.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;So what’s happening between your question and its answer?&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;When you type a question, the model doesn’t see a neat English sentence.&lt;br&gt;
First, it chops your text into tokens — small chunks like words or pieces of words. Those tokens are then turned into numbers and pushed into the model’s brain.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;From there, those numbers travel through layers in the network.&lt;br&gt;
Each layer looks at the numbers, reacts a bit, and passes them on to the next layer.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;So what this means for you: that “thinking” delay isn’t the AI meditating; it’s your sentence running through a long tunnel of tiny reactions before the model spits out the next word.&lt;/p&gt;

&lt;h3&gt;
  
  
  Think of it like an assembly line for meaning
&lt;/h3&gt;

&lt;p&gt;Imagine a factory assembly line.&lt;br&gt;
At the start, you drop in raw metal. Every station bends, drills, paints, or checks something. By the end, you’ve got a finished car. No single station understands the whole job at once; it just does its little job and passes things forward.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A neural network works the same way.&lt;/strong&gt;&lt;br&gt;
Your tokens go into the first layer, get slightly transformed, then move to the next layer, and so on. Stack enough of these, and you’ve turned raw text into something that feels like understanding.&lt;/p&gt;

&lt;p&gt;So what this means for you: when an AI answer feels smart, it’s not because there’s one genius node inside — it’s because thousands of tiny, dumb steps are wired together in a clever order.&lt;/p&gt;
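&lt;p&gt;Here’s a toy version of that assembly line: each “station” applies a tiny transformation and passes the result forward, and only the stack does anything interesting. All the numbers below are made up for illustration:&lt;/p&gt;

```python
def layer(x, scale, shift):
    """One station: scale and shift each number, then clip negatives to zero."""
    return [max(0.0, scale * v + shift) for v in x]

# Stack a few stations; no single one does much, the stack does the work.
x = [0.2, -0.4, 1.0]  # made-up numbers standing in for token signals
for scale, shift in [(2.0, 0.1), (0.5, -0.2), (3.0, 0.0)]:
    x = layer(x, scale, shift)
print([round(v, 2) for v in x])  # [0.15, 0.0, 2.55]
```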

&lt;h3&gt;
  
  
  Neurons: tiny light bulbs that notice patterns
&lt;/h3&gt;

&lt;p&gt;Inside each layer live neurons — tiny units that light up for certain patterns.&lt;br&gt;
One neuron might quietly specialize in “sad tone,” another in “locations,” another in “legal-ish language.”&lt;/p&gt;

&lt;p&gt;Each neuron takes the incoming numbers, looks at how strong they are, and decides:&lt;br&gt;
“Do I stay dim, or do I light up for this?”&lt;br&gt;
If it sees the pattern it cares about, it glows more and sends a stronger signal on to the next layer.&lt;/p&gt;

&lt;p&gt;So what this means for you: when you ask a question, you’re basically lighting up a custom constellation of neurons: a unique pattern of tiny bulbs flickering on and off that represents “what the model thinks you’re asking.”&lt;/p&gt;

&lt;h3&gt;
  
  
  Layers: from raw words to “oh, I get what you mean”
&lt;/h3&gt;

&lt;p&gt;Different layers care about different kinds of things.&lt;/p&gt;

&lt;p&gt;Early layers mostly pick up low‑level stuff: is this a question, a statement, a list? Are there names, dates, places here?&lt;br&gt;
Middle layers start combining that: “question + about time + about sports → probably asking for a match schedule.”&lt;br&gt;
Later layers work with more abstract ideas: “they’re comparing two tools,” “they want a step‑by‑step,” “this sounds like they’re asking ‘why’, not ‘how’.”&lt;/p&gt;

&lt;p&gt;So what this means for you: the deeper you go into the network, the less it cares about raw words and the more it’s dealing with your intent. By the time it answers, it’s replying to the idea behind your words, not just the letters you typed.&lt;/p&gt;

&lt;h3&gt;
  
  
  Activations: how the brain decides what to ignore
&lt;/h3&gt;

&lt;p&gt;If every neuron fired all the time, the model would just see noise.&lt;br&gt;
So each neuron uses an activation rule, basically: “Is this signal strong enough for me to care?”&lt;br&gt;
&lt;em&gt;You can picture it like a dimmer switch:&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Weak signal? The neuron stays mostly dark.&lt;br&gt;
Strong signal? It brightens and says, “This matters, push me forward.”&lt;br&gt;
This is how the model can tell the difference between “river bank” and “open a bank account”: different neurons light up in each case, because the surrounding words give different vibes.&lt;/p&gt;

&lt;p&gt;So what this means for you: under the hood, the AI is constantly highlighting important bits of your question and quietly fading out the rest, so its answer is shaped by what it thinks really matters.&lt;/p&gt;
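&lt;p&gt;The most common activation rule, ReLU, really is that simple: weak or negative signals stay dark, strong ones pass through. A one‑line sketch:&lt;/p&gt;

```python
def relu(signal):
    """Dimmer switch: weak or negative signals stay dark (0), strong ones pass."""
    return max(0.0, signal)

print([relu(s) for s in [-2.0, 0.3, 5.0]])  # [0.0, 0.3, 5.0]
```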

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo2ppj3hahye0r558fepp.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo2ppj3hahye0r558fepp.jpg" alt="What’s Really Happening Inside When It “Thinks”?" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The answer is just a rapid‑fire chain of bets
&lt;/h3&gt;

&lt;p&gt;All of this (layers, neurons, activations) is just setup for one job: pick the next token.&lt;br&gt;
After your question flows through all the layers, the model ends up with a rich internal state that says, “Given everything so far, which token is most likely next?”&lt;/p&gt;

&lt;p&gt;It then builds a list of possible next tokens with probabilities, kind of like:&lt;br&gt;
“Maybe ‘the’ (20%), ‘it’ (15%), ‘they’ (10%), definitely not ‘Bangalore’ (0.01%).”&lt;br&gt;
It samples one, adds it to the text, and then repeats the whole process with the updated context to pick the next word, and the next, and the next.&lt;/p&gt;

&lt;p&gt;So what this means for you: that long, smooth paragraph the AI gives you is literally just a chain of word bets, guided by all those internal strong signals and layers reacting in the background.&lt;/p&gt;
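&lt;p&gt;A toy sketch of that sampling step, using the invented probabilities from above (real models choose among tens of thousands of tokens and often reshape the distribution with a “temperature” first):&lt;/p&gt;

```python
import random

# Invented next-token probabilities, echoing the example above:
probs = {"the": 0.20, "it": 0.15, "they": 0.10, "Bangalore": 0.0001}

rng = random.Random(42)  # seeded so the demo repeats
tokens = list(probs)
weights = list(probs.values())
# Sample one token, append it, repeat with the new context; here we just make 5 bets.
picks = rng.choices(tokens, weights=weights, k=5)
print(picks)  # "Bangalore" is possible but wildly unlikely
```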

&lt;h3&gt;
  
  
  Why the answer can feel brilliant… or confidently wrong
&lt;/h3&gt;

&lt;p&gt;When the training data is good and the internal pattern detectors are well‑shaped, those word bets line up into responses that feel thoughtful and on point.&lt;br&gt;
That’s when you get the “wow, it really understands me” moment.&lt;/p&gt;

&lt;p&gt;But remember from Day 2: the model doesn’t have a truth button.&lt;br&gt;
If its learned patterns point toward a wrong but plausible sentence, it will happily say that too; that’s a hallucination.&lt;/p&gt;

&lt;p&gt;So what this means for you: the same machinery that makes answers feel coherent also makes wrong answers sound extremely confident.&lt;/p&gt;

&lt;p&gt;Smooth text doesn’t guarantee true text; it just means the internal chain of reactions is doing what it was trained to do.&lt;/p&gt;

&lt;h3&gt;
  
  
  The mental picture to keep from Day 3
&lt;/h3&gt;

&lt;p&gt;If you had to compress today into one picture, use this:&lt;/p&gt;

&lt;p&gt;Your question → chopped into tokens, turned into numbers&lt;br&gt;
Numbers → pushed through an assembly line of layers&lt;br&gt;
Inside each layer → neurons (light bulbs) fire for patterns&lt;br&gt;
Activations → decide what to amplify and what to ignore&lt;br&gt;
Final state → used to bet on the next word, over and over&lt;/p&gt;

&lt;p&gt;So what this means for you: an AI model isn’t a mystical brain. It’s a giant, carefully wired machine that turns your words into internal reactions and then into a stream of word predictions. Very fast, very organized, but still just a chain of reactions.&lt;/p&gt;

&lt;h3&gt;
  
  
  What’s coming on Day 4: the one trick that changed everything
&lt;/h3&gt;

&lt;p&gt;Today, we treated the AI like a brain made of layers and light bulbs, reacting in sequence.&lt;br&gt;
But there’s one idea that supercharged all of this and made modern AI chatbots, code assistants, and image generators actually feel useful: a way for the model to pay attention to different parts of your input at the same time.&lt;/p&gt;

&lt;p&gt;Tomorrow, in Day 4, “The One Idea That Made Modern AI Possible”, we’ll unpack that trick in plain language.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;What blew your mind most? Drop a comment!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>explainlikeimfive</category>
      <category>beginners</category>
    </item>
    <item>
      <title>How AI Actually Learns: The Training Story Nobody Tells You (Day 2/30 - Beginner AI Series)</title>
      <dc:creator>Ankit Dey</dc:creator>
      <pubDate>Wed, 11 Mar 2026 11:58:51 +0000</pubDate>
      <link>https://forem.com/ankitdey01/day-2-beginner-ai-series-how-ai-actually-learns-the-training-story-nobody-tells-you-2hi4</link>
      <guid>https://forem.com/ankitdey01/day-2-beginner-ai-series-how-ai-actually-learns-the-training-story-nobody-tells-you-2hi4</guid>
      <description>&lt;p&gt;&lt;code&gt;You met the "brain" yesterday: billions of tiny weights that turn text into predictions.&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Today's obvious question&lt;/strong&gt;: who set those weights in the first place?&lt;/p&gt;

&lt;p&gt;Spoiler: no one sat down and typed them in by hand. The model learned them, the hard way, by failing over and over again and getting tiny nudges in a better direction.&lt;/p&gt;

&lt;p&gt;Wait, so who chose the weights?&lt;/p&gt;

&lt;p&gt;When you talk to an AI model, you're seeing the finished brain. All the learning already happened earlier, during training, on a huge pile of text: books, Wikipedia, web pages, and more. The training code starts with almost-random weights and slowly shapes them until the model gets good at its job.&lt;br&gt;
So the real magic isn't just the architecture or the size of the model - it's this long grind of "guess, check, fix, repeat" that slowly turns noise into something that feels smart.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If Day 1 was about what the brain looks like, Day 2 is about how that brain grew up.&lt;/em&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Learning like a kid with a basketball
&lt;/h3&gt;

&lt;p&gt;Forget math for a second. Imagine you're teaching a kid to shoot a basketball.&lt;br&gt;
First shot? Wildly off. You don't give them a 300‑page physics book. You just say: "Too high. Aim shorter." Next shot: "Too far. Use less power."&lt;br&gt;
The pattern is always:&lt;br&gt;
Try something.&lt;br&gt;
See how wrong it was.&lt;br&gt;
Adjust the intensity a tiny bit.&lt;br&gt;
Repeat an absurd number of times until the shot becomes reliable.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fye04iwybk006tsiaqaj6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fye04iwybk006tsiaqaj6.png" alt="How AI learns itself. The basketball analogy" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Training an AI model is the same vibe, just with code instead of a coach, and text instead of basketballs.&lt;br&gt;
The key idea: the kid never gets a full "rulebook of basketball" - they just get feedback on each throw, and the rules emerge from practice. It learns all by itself through trial and error.&lt;/p&gt;
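&lt;p&gt;&lt;em&gt;Here's that "guess, check, fix, repeat" loop in its smallest possible code form (a made-up one-number example, not a real training setup): learning a single "shot power" from nothing but feedback.&lt;/em&gt;&lt;/p&gt;

```python
# The guess-check-fix loop on one invented number: the shot "power".
target_power = 7.0      # the power that actually sinks the shot
power = 0.0             # first attempt: wildly off
learning_rate = 0.1     # how big each nudge is

for attempt in range(200):
    error = power - target_power            # check: how wrong was it?
    power = power - learning_rate * error   # fix: tiny nudge, not a rewrite

print(round(power, 3))  # converges close to 7.0
```

&lt;p&gt;No rulebook anywhere - just two hundred small corrections. Real training is this exact idea, repeated across billions of weights.&lt;/p&gt;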




&lt;h3&gt;
  
  
  The four steps of the training loop
&lt;/h3&gt;

&lt;p&gt;Under the hood, every modern neural network learns using the same four-step loop, repeated millions or billions of times.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Forward pass&lt;/strong&gt;: "fill in the blank"
The model sees some text with a missing word and guesses what comes next.
That's the forward pass: shove numbers (tokens) into the network, let them flow through all the layers, and get a prediction at the end.
In our basketball analogy, this is "take a shot at the hoop."
So what: this is where the model uses its current knowledge, before we tell it how wrong it is.&lt;/li&gt;
&lt;/ol&gt;




&lt;ol start="2"&gt;
&lt;li&gt;
&lt;strong&gt;Loss&lt;/strong&gt;: "how bad was that?"
Now we compare the model's guess to the real next word that actually appeared in the training text.
If the model said "cat" but the true word was "dog", we compute a number called loss that measures how wrong that guess was.
In our basketball analogy, this is measuring how far the shot missed the hoop.
Higher loss = worse guess. Lower loss = better.
So what: this is the "ouch" signal. Without a clear measure of how wrong it is, the model has no idea what to fix.&lt;/li&gt;
&lt;/ol&gt;




&lt;ol start="3"&gt;
&lt;li&gt;
&lt;strong&gt;Backpropagation&lt;/strong&gt;: blame assignment
Now comes the sneaky part: which weights were responsible for that bad guess?
Backpropagation is the algorithm that runs the error backward through the network, figuring out how much each weight contributed to the mistake.
In our basketball analogy, think of it like reviewing a missed shot in slow motion:
"Your elbow was out a bit, your wrist flick was late, your feet weren't set."
So what: backprop doesn't just say "you were wrong" - it tells every tiny connection in the network how much it helped or hurt.&lt;/li&gt;
&lt;/ol&gt;




&lt;ol start="4"&gt;
&lt;li&gt;
&lt;strong&gt;Gradient descent&lt;/strong&gt;: tiny course corrections
Once we know how each weight contributed to the error, gradient descent steps in to actually change them.
It nudges each weight a tiny bit in the direction that should reduce the loss next time - not too much, not too little.
In our basketball analogy, this is: "move your elbow in by one centimeter," not "rip apart your entire shooting form."
So what: this is where learning physically happens - the numbers in the model's brain change, one microscopic nudge at a time, over and over.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fymeq7qf9mu1pm995m4l8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fymeq7qf9mu1pm995m4l8.png" alt="scatter plot matrix" width="800" height="449"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  What the model really learns (and why it's weird)
&lt;/h3&gt;

&lt;p&gt;Here's the twist: the model is never told "these are the facts about the world."&lt;br&gt;
Its only job during training is: predict the next word as well as possible on huge amounts of text.&lt;br&gt;
For example: suppose the model's training text repeatedly claimed that humans inhale carbon dioxide and exhale oxygen. The model absorbs that pattern even though it's wrong. Later, when you ask "What do humans inhale to live?", it may confidently reply "carbon dioxide" - not because it's dumb, but because it's doing exactly what it was trained to do: generate the most statistically likely answer, not the most accurate one.&lt;br&gt;
Facts, concepts, and "knowledge" show up as a side effect of getting really good at that prediction game.&lt;br&gt;
So what: your sense that "the model knows things" is an illusion built on top of pattern recognition, not a clean internal encyclopedia.&lt;/p&gt;




&lt;h3&gt;
  
  
  Hallucinations, cutoffs, and bias - explained
&lt;/h3&gt;

&lt;p&gt;Once you see training this way, a bunch of AI "quirks" suddenly make sense.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hallucinations&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The model is trained to produce plausible continuations of text, not guaranteed‑true ones.&lt;br&gt;
If the statistically most likely answer is a confident but wrong statement, it will happily say that - because the training objective cares about patterns, not truth.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Knowledge cutoff&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Models are trained on data up to some date, then frozen.&lt;br&gt;
Anything that happened after that isn't in the training text, so the model can only guess based on older patterns - which is why different models talk about their "knowledge cutoff."&lt;br&gt;
However, this is starting to be less of a problem in practice. Many modern chatbots now sit on top of the base model and add a second step: they go out to live data sources (like the web, your docs, or company databases), pull in fresh information, and feed that into the model as extra context before it answers anything. The underlying brain still has a cutoff, but the overall system feels much more up‑to‑date because it's constantly grounding itself in real‑time information instead of relying only on what it saw during training.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bias&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Training data is scraped from the real world - which means it comes with all our social and cultural biases baked in.&lt;br&gt;
The model learns those patterns too, unless people work very hard to filter and fine‑tune it afterwards.&lt;br&gt;
So what: if you remember only one thing, make it this - the model is a mirror of its training data and objective. Change those, and you change its behavior.&lt;/p&gt;




&lt;h3&gt;
  
  
  What's coming on Day 3
&lt;/h3&gt;

&lt;p&gt;Today we stayed at the "how learning works" level - the practice, the feedback, the tiny nudges to weights.&lt;br&gt;
Tomorrow we'll open the brain up and look at what really happens inside your AI when it pauses and "thinks" before answering you.&lt;br&gt;
We'll cover layers, neurons, connections, and why "neural network architecture" is the reason some models feel smarter than others.&lt;br&gt;
Think of Day 2 as "how the kid trained for his basketball matches," and Day 3 as "what the kid thinks while playing."&lt;/p&gt;

&lt;p&gt;See you there.&lt;/p&gt;

&lt;p&gt;What blew your mind most? Drop a comment!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>beginners</category>
      <category>explainlikeimfive</category>
    </item>
    <item>
      <title>How does your AI seem to know everything you ask it? (Day 1/30 - Beginner AI Series)</title>
      <dc:creator>Ankit Dey</dc:creator>
      <pubDate>Tue, 10 Mar 2026 18:37:49 +0000</pubDate>
      <link>https://forem.com/ankitdey01/how-does-you-ai-know-so-much-with-such-less-size-37pg</link>
      <guid>https://forem.com/ankitdey01/how-does-you-ai-know-so-much-with-such-less-size-37pg</guid>
      <description>&lt;h3&gt;
  
  
  How AI Magically "Gets" You &lt;strong&gt;(Without a Giant Dumpyard of Information)&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Ever wonder how your phone's AI buddy predicts exactly what you mean, even in a messy sentence? It's not digging through a massive database the way old-school search did. Nope. Modern AI, like ChatGPT, Claude, or any other LLM, is lightweight, well trained, and way smarter. It doesn't hunt for keywords to match exactly what you typed. It plays a game of high-stakes word guessing, powered by probabilities and sneaky connections. Let me spill the beans, step by step, like we're cracking open a secret.&lt;/p&gt;

&lt;h3&gt;
  
  
  Forget Databases: It's All About Relationships, Not Rules
&lt;/h3&gt;

&lt;p&gt;Picture an old-school search engine: You type "best pizza recipe," it scans millions of pages for those exact words, grabs a match, and spits it out. Boring, rigid, and so huge it needs a server farm the size of a football field.&lt;/p&gt;

&lt;p&gt;AI? Totally different concept. It learns relationships between words from billions of internet sentences. Not every single pair (that'd be impossible—there are trillions!). Instead, it crunches patterns into something called &lt;em&gt;weights&lt;/em&gt;. Think of it like a social network: "Pizza" hangs out a lot with "cheese," "oven," and "yum." "Quantum" buddies up with "physics" and "weird." These weights are just numbers saying, "Hey, these words show up together 80% of the time."&lt;/p&gt;

&lt;p&gt;No storing every combo. Just smart shortcuts baked into a tiny model that fits on your laptop.&lt;/p&gt;
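&lt;p&gt;&lt;em&gt;Here's a crude cartoon of those "relationship weights" (toy three-sentence corpus, invented numbers - real models learn far richer patterns than raw co-occurrence counts, but the flavor is similar):&lt;/em&gt;&lt;/p&gt;

```python
# Toy cartoon of "relationship weights": count how often words share a
# sentence, then normalize. Real weights are learned, not counted.
from collections import Counter
from itertools import combinations

corpus = [
    "pizza cheese oven",
    "pizza cheese yum",
    "quantum physics weird",
]

pairs = Counter()
for sentence in corpus:
    for a, b in combinations(sorted(sentence.split()), 2):
        pairs[(a, b)] += 1

total = sum(pairs.values())
weights = {pair: count / total for pair, count in pairs.items()}
print(weights[("cheese", "pizza")])   # "pizza" and "cheese" hang out a lot
```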

&lt;h3&gt;
  
  
  The Magic: Predicting the Next Word, One Probability at a Time
&lt;/h3&gt;

&lt;p&gt;Here's the cool part—AI is a &lt;em&gt;prediction machine&lt;/em&gt;. It generates answers word-by-word, betting on what's next based on probabilities.&lt;/p&gt;

&lt;p&gt;Say you ask: "How do I make..."&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Step 1: It looks at "How do I make" and cranks probabilities. "Pizza"? High score (0.7). "A bomb"? Super low (0.001, and blocked anyway).&lt;/li&gt;
&lt;li&gt;Step 2: Picks "pizza" because the weights scream "recipe incoming!" Then predicts "dough," "toppings," etc.&lt;/li&gt;
&lt;li&gt;Step 3: Every new word updates the odds, building context. "Make pizza dough" now boosts "flour" way up.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It's like autocomplete on steroids. Trained on zillions of examples, it "knows" a doctor "prescribes medicine," not "paints murals." Probability math inside a fancy architecture called a transformer juggles all this in milliseconds.&lt;/p&gt;
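&lt;p&gt;&lt;em&gt;Here's "autocomplete on steroids" in miniature (a toy next-word counter over made-up text - a real transformer is vastly more sophisticated, but the betting idea is the same):&lt;/em&gt;&lt;/p&gt;

```python
# Toy next-word predictor: learn odds from counts in made-up "training
# text", then always bet on the most likely continuation.
from collections import Counter, defaultdict

text = ("the doctor prescribes medicine . the doctor prescribes rest . "
        "the artist paints murals .")
words = text.split()

next_counts = defaultdict(Counter)
for prev, cur in zip(words, words[1:]):
    next_counts[prev][cur] += 1

def predict(prev):
    counts = next_counts[prev]
    total = sum(counts.values())
    best, n = counts.most_common(1)[0]
    return best, n / total

print(predict("doctor"))  # ('prescribes', 1.0): doctors prescribe here
```

&lt;p&gt;Swap the toy counts for billions of learned weights and a much longer memory of context, and you have the basic game a real model plays.&lt;/p&gt;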

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgh6p2rljxrcmvj5dg1qp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgh6p2rljxrcmvj5dg1qp.png" alt=" " width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Why It's Lightweight and Lightning-Fast
&lt;/h3&gt;

&lt;p&gt;No keyword lists or full sentences stored. Just &lt;em&gt;billions of tuned weights&lt;/em&gt; (parameters) linking ideas, mastered during training. A model like GPT-3 has 175 billion of them: sounds huge, but it's far more compact than storing the raw text it learned from.&lt;/p&gt;

&lt;p&gt;Bottom line: AI feels psychic because it bets on patterns humans love. No magic database. Just probability wizardry making chit-chat feel natural.&lt;/p&gt;

&lt;p&gt;What blew your mind most? Drop a comment!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>machinelearning</category>
      <category>explainlikeimfive</category>
    </item>
  </channel>
</rss>
