<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Anik Chand</title>
    <description>The latest articles on Forem by Anik Chand (@anikchand461).</description>
    <link>https://forem.com/anikchand461</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1406732%2Fb5f85d80-432f-439f-ae7e-f818caf480f0.jpg</url>
      <title>Forem: Anik Chand</title>
      <link>https://forem.com/anikchand461</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/anikchand461"/>
    <language>en</language>
    <item>
      <title>I Was Tired of Waiting for GridSearchCV. So I Built Something Smarter. 🚀</title>
      <dc:creator>Anik Chand</dc:creator>
      <pubDate>Tue, 24 Mar 2026 19:15:00 +0000</pubDate>
      <link>https://forem.com/anikchand461/i-was-tired-of-waiting-for-gridsearchcv-so-i-built-something-smarter-323g</link>
      <guid>https://forem.com/anikchand461/i-was-tired-of-waiting-for-gridsearchcv-so-i-built-something-smarter-323g</guid>
      <description>&lt;p&gt;Have you ever set up a &lt;code&gt;GridSearchCV&lt;/code&gt;, pressed run, watched the little spinner go... and then just &lt;strong&gt;left the room&lt;/strong&gt;? Maybe made tea. Maybe made dinner. Came back — and it was &lt;em&gt;still&lt;/em&gt; running?&lt;/p&gt;

&lt;p&gt;I hit that wall one too many times. Instead of waiting, I started thinking — &lt;em&gt;why does this have to be this slow?&lt;/em&gt; That frustration turned into a late-night coding session, which became &lt;strong&gt;&lt;a href="https://pypi.org/project/lazytune/" rel="noopener noreferrer"&gt;LazyTune&lt;/a&gt;&lt;/strong&gt; — a smarter hyperparameter tuner for scikit-learn that I turned into a proper Python package with a live web app.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem with GridSearchCV
&lt;/h2&gt;

&lt;p&gt;Here's what &lt;code&gt;GridSearchCV&lt;/code&gt; does under the hood:&lt;/p&gt;

&lt;p&gt;You give it a parameter grid. Say 4 values for &lt;code&gt;n_estimators&lt;/code&gt;, 4 for &lt;code&gt;max_depth&lt;/code&gt;, 4 for &lt;code&gt;min_samples_split&lt;/code&gt;. That's &lt;strong&gt;64 combinations&lt;/strong&gt;. With 5-fold CV, that's &lt;strong&gt;320 full training runs&lt;/strong&gt;. On your entire dataset. Every single one.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;RandomizedSearchCV&lt;/code&gt; helps a little — it just picks random combos instead of all of them. But random is &lt;em&gt;dumb&lt;/em&gt;. It has no idea which combinations are promising. Tools like Optuna and Hyperopt are genuinely clever, but they come with their own vocabulary and APIs, and honestly feel like overkill when you just want to tune a Random Forest on a Friday afternoon.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Insight That Started Everything
&lt;/h2&gt;

&lt;p&gt;Here's the thought that clicked at 2am:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Most hyperparameter combinations are obviously bad within the first few training rounds. You don't need to fully train them to know they're losers.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Think of it like a talent show audition. You don't give every contestant an hour-long slot. You do a quick 2-minute round first, figure out who's genuinely talented, &lt;em&gt;then&lt;/em&gt; give the finalists the full slot.&lt;/p&gt;

&lt;p&gt;LazyTune does exactly this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Generate all combinations&lt;/li&gt;
&lt;li&gt;Do a &lt;strong&gt;quick screening&lt;/strong&gt; on a small data subset&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rank&lt;/strong&gt; every combination by early performance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Prune&lt;/strong&gt; the obvious losers (the bottom X%)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fully train&lt;/strong&gt; only the survivors&lt;/li&gt;
&lt;li&gt;Return the winner&lt;/li&gt;
&lt;/ol&gt;
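&lt;p&gt;To make that concrete, here is a minimal sketch of the screen/rank/prune/refit loop in plain scikit-learn. This is illustrative only, not LazyTune's actual internals:&lt;/p&gt;

```python
# Minimal sketch of screen -> rank -> prune -> refit
# (illustrative only; not LazyTune's real implementation).
import numpy as np
from sklearn.base import clone
from sklearn.model_selection import ParameterGrid, cross_val_score, train_test_split


def screen_and_prune(estimator, param_grid, X, y, prune_ratio=0.5):
    # 1. Generate all combinations
    candidates = list(ParameterGrid(param_grid))

    # 2. Quick screening: cheap 3-fold CV on a 30% subset
    X_small, _, y_small, _ = train_test_split(X, y, train_size=0.3, random_state=0)
    scores = [
        cross_val_score(clone(estimator).set_params(**p), X_small, y_small, cv=3).mean()
        for p in candidates
    ]

    # 3-4. Rank every combination, keep only the top prune_ratio fraction
    order = np.argsort(scores)[::-1]
    n_keep = max(1, int(len(candidates) * prune_ratio))
    survivors = [candidates[i] for i in order[:n_keep]]

    # 5-6. Fully train only the survivors with proper CV; return the winner
    return max(
        survivors,
        key=lambda p: cross_val_score(clone(estimator).set_params(**p), X, y, cv=5).mean(),
    )
```

Even this naive version only pays the full cross-validation cost for the survivors.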

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7k5k15kqiz7s9gr61ly9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7k5k15kqiz7s9gr61ly9.png" alt="Main diagram" width="800" height="722"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The whole thing lives in one class — &lt;code&gt;SmartSearch&lt;/code&gt; — that you use almost identically to &lt;code&gt;GridSearchCV&lt;/code&gt;. No new mental model. No new vocabulary. Just smarter.&lt;/p&gt;




&lt;h2&gt;
  
  
  Under the Hood — The Full Data Flow
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fek9bweo1any1slfno9li.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fek9bweo1any1slfno9li.png" alt="Dataflow" width="800" height="419"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1 —&lt;/strong&gt; Split your data: 80% training, 20% held-out test set.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2 —&lt;/strong&gt; Split the training set further: 30% becomes a "small screening subset," 70% becomes a validation pool.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3 —&lt;/strong&gt; Screen ALL combinations quickly on that 30% subset using 3-fold CV.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 4 —&lt;/strong&gt; Rank all combinations by validation score — you get a full leaderboard.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 5 —&lt;/strong&gt; With &lt;code&gt;prune_ratio=0.1&lt;/code&gt;, keep the top ~10%. The other 90%? Gone. Never fully trained. ✂️&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 6 —&lt;/strong&gt; Fully train the surviving configs on the entire training set with proper CV.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 7 —&lt;/strong&gt; Evaluate survivors on the held-out 20%. The winner comes back as &lt;code&gt;best_estimator_&lt;/code&gt; with &lt;code&gt;best_params_&lt;/code&gt; and &lt;code&gt;best_score_&lt;/code&gt;.&lt;/p&gt;
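&lt;p&gt;In sklearn terms, Steps 1 and 2 are two nested &lt;code&gt;train_test_split&lt;/code&gt; calls. Treat the seeds below as a sketch of the idea, not LazyTune's literal code:&lt;/p&gt;

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(1000, 5)
y = np.random.randint(0, 2, 1000)

# Step 1: 80% training pool / 20% held-out test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Step 2: 30% of the training pool becomes the cheap screening subset,
# the remaining 70% is the validation pool
X_screen, X_val, y_screen, y_val = train_test_split(
    X_train, y_train, train_size=0.3, random_state=42
)

print(len(X_train), len(X_test))   # 800 200
print(len(X_screen), len(X_val))   # 240 560
```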

&lt;p&gt;The result: for a 27-combination grid, serious compute goes to roughly 3 surviving configs instead of all 27 — and because the screening was informative, the winner is almost always the same one exhaustive GridSearchCV would have picked.&lt;/p&gt;




&lt;h2&gt;
  
  
  GridSearchCV vs LazyTune — The Visual Difference
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu2prxmonu7xjo1ru1y4b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu2prxmonu7xjo1ru1y4b.png" alt="visual difference" width="800" height="440"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;On the left, every combination gets the full expensive treatment. On the right, LazyTune screens all 9 cheaply, crosses out the clear losers, and only invests in the survivors. Same result. Fraction of the compute.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Benchmarks
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Does LazyTune match GridSearchCV's accuracy?
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbeelgz1527857ae876e6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbeelgz1527857ae876e6.png" alt="gridsearchcv vs. lazytune" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For all three classifiers — RandomForest, SVC, LogisticRegression — the accuracy bars are practically indistinguishable, with both tuners landing in the 95–97% range. &lt;strong&gt;LazyTune matches GridSearchCV's accuracy almost perfectly.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  LazyTune vs Every Major Tuner
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuguh7pofgyvx4e6i673w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuguh7pofgyvx4e6i673w.png" alt="benchmark" width="800" height="466"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;Accuracy&lt;/th&gt;
&lt;th&gt;Runtime&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;🥇 &lt;strong&gt;LazyTune&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0.940&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;6.23s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GridSearchCV&lt;/td&gt;
&lt;td&gt;0.909&lt;/td&gt;
&lt;td&gt;5.02s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RandomizedSearchCV&lt;/td&gt;
&lt;td&gt;0.909&lt;/td&gt;
&lt;td&gt;4.83s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Optuna&lt;/td&gt;
&lt;td&gt;0.912&lt;/td&gt;
&lt;td&gt;5.43s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hyperopt&lt;/td&gt;
&lt;td&gt;0.913&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;10.37s&lt;/strong&gt; 😬&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;LazyTune gets the &lt;strong&gt;highest accuracy of all five&lt;/strong&gt;. Hyperopt — often touted as the smart Bayesian choice — takes 10.37s and still scores lower.&lt;/p&gt;

&lt;h3&gt;
  
  
  Large Dataset Benchmark
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2vay46iimvuxw4sqesd9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2vay46iimvuxw4sqesd9.png" alt="large dataset comparison" width="800" height="469"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;Accuracy&lt;/th&gt;
&lt;th&gt;Runtime&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;🥇 &lt;strong&gt;LazyTune&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0.982&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;122.3s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GridSearchCV&lt;/td&gt;
&lt;td&gt;0.978&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;143.5s&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RandomizedSearchCV&lt;/td&gt;
&lt;td&gt;0.978&lt;/td&gt;
&lt;td&gt;24.5s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Optuna&lt;/td&gt;
&lt;td&gt;0.978&lt;/td&gt;
&lt;td&gt;76.2s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hyperopt&lt;/td&gt;
&lt;td&gt;0.977&lt;/td&gt;
&lt;td&gt;86.6s&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;LazyTune still leads on accuracy. GridSearchCV is &lt;strong&gt;17% slower&lt;/strong&gt; and still loses. LazyTune consistently gives you the best accuracy-per-second of anything tested.&lt;/p&gt;




&lt;h2&gt;
  
  
  Let's Write Some Code
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;lazytune
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Basic usage — Random Forest
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.datasets&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_breast_cancer&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.ensemble&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RandomForestClassifier&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;lazytune&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SmartSearch&lt;/span&gt;

&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;load_breast_cancer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;return_X_y&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;param_grid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;n_estimators&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;150&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_depth&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;min_samples_split&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;search&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SmartSearch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;estimator&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;RandomForestClassifier&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;random_state&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;param_grid&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;param_grid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;metric&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;accuracy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;cv_folds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;prune_ratio&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;n_jobs&lt;/span&gt;&lt;span class="o"&gt;=-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;search&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Best parameters:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;search&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;best_params_&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Best CV score:  &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;search&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;best_score_&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Best model:     &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;search&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;best_estimator_&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh5zh70aj6ywm5phuhotj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh5zh70aj6ywm5phuhotj.png" alt="api comparison diagram" width="800" height="351"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;One new parameter — &lt;code&gt;prune_ratio&lt;/code&gt;. One renamed parameter — &lt;code&gt;metric&lt;/code&gt; instead of &lt;code&gt;scoring&lt;/code&gt;. Everything else is identical to sklearn.&lt;/p&gt;
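&lt;p&gt;For comparison, here is the plain-sklearn version of the same search: same estimator and grid, &lt;code&gt;scoring&lt;/code&gt; instead of &lt;code&gt;metric&lt;/code&gt;, and no &lt;code&gt;prune_ratio&lt;/code&gt; because &lt;code&gt;GridSearchCV&lt;/code&gt; always trains every combination:&lt;/p&gt;

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# The plain-sklearn equivalent of the SmartSearch call above.
search = GridSearchCV(
    estimator=RandomForestClassifier(random_state=42),
    param_grid={
        "n_estimators": [50, 100, 150, 200],
        "max_depth": [5, 10, 15, None],
        "min_samples_split": [2, 3, 4, 5],
    },
    scoring="accuracy",  # SmartSearch calls this `metric`
    cv=3,
    n_jobs=-1,
)
```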

&lt;h3&gt;
  
  
  SVM with F1 score
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;search&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SmartSearch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;estimator&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;SVC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;random_state&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;param_grid&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;C&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;kernel&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;linear&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rbf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gamma&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;scale&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;auto&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.001&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.0001&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;metric&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;f1_macro&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;cv_folds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;prune_ratio&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.6&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;search&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Understanding &lt;code&gt;prune_ratio&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmna1grn8vmv114wcqcrn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmna1grn8vmv114wcqcrn.png" alt="prune ratio" width="800" height="336"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;code&gt;prune_ratio&lt;/code&gt;&lt;/th&gt;
&lt;th&gt;What happens&lt;/th&gt;
&lt;th&gt;Best for&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;0.1&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Keep only top 10%&lt;/td&gt;
&lt;td&gt;Huge grids where you trust fast screening&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;0.3&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Keep top 30%&lt;/td&gt;
&lt;td&gt;Good balance, slightly conservative&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;0.5&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Keep top 50%&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Start here — recommended default&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;1.0&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Keep everything&lt;/td&gt;
&lt;td&gt;Same as GridSearchCV — comparison only&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Always start at &lt;code&gt;0.5&lt;/code&gt;. Once you trust the screening on your dataset, try going lower.&lt;/p&gt;
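&lt;p&gt;To sanity-check how many configs survive a given &lt;code&gt;prune_ratio&lt;/code&gt; before running anything, the math is simple (assuming straight truncation; LazyTune's exact rounding may differ):&lt;/p&gt;

```python
from sklearn.model_selection import ParameterGrid

grid = {
    "n_estimators": [50, 100, 150, 200],
    "max_depth": [5, 10, 15, None],
    "min_samples_split": [2, 3, 4, 5],
}
n_candidates = len(ParameterGrid(grid))  # 64

for prune_ratio in (0.1, 0.3, 0.5, 1.0):
    survivors = max(1, int(n_candidates * prune_ratio))
    print(f"prune_ratio={prune_ratio}: {survivors} of {n_candidates} fully trained")
```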




&lt;h2&gt;
  
  
  After &lt;code&gt;.fit()&lt;/code&gt; — Everything You Get Back
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;search&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;best_params_&lt;/span&gt;      &lt;span class="c1"&gt;# dict      — the winning hyperparameter combo
&lt;/span&gt;&lt;span class="n"&gt;search&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;best_score_&lt;/span&gt;       &lt;span class="c1"&gt;# float     — best cross-validated score
&lt;/span&gt;&lt;span class="n"&gt;search&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;best_estimator_&lt;/span&gt;   &lt;span class="c1"&gt;# model     — fully fitted, ready to use
&lt;/span&gt;&lt;span class="n"&gt;search&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;summary_&lt;/span&gt;          &lt;span class="c1"&gt;# DataFrame — every trial ranked by score
&lt;/span&gt;&lt;span class="n"&gt;search&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cv_results_&lt;/span&gt;       &lt;span class="c1"&gt;# dict      — full CV results per candidate
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Since &lt;code&gt;best_estimator_&lt;/code&gt; is a normal sklearn model:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;predictions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;search&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_new&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;accuracy&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;search&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_test&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No adapters. No wrappers. Just works.&lt;/p&gt;




&lt;h2&gt;
  
  
  There's Also a Web App
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://lazytune.vercel.app" rel="noopener noreferrer"&gt;lazytune.vercel.app&lt;/a&gt;&lt;/strong&gt; — upload your CSV, pick a model, enter your parameter ranges, hit run. The same &lt;code&gt;SmartSearch&lt;/code&gt; engine runs on the backend. No local setup required.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's On the Roadmap
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Auto &lt;code&gt;prune_ratio&lt;/code&gt;&lt;/strong&gt; — calibrate pruning based on grid size and a time budget&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;XGBoost / LightGBM native support&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Early stopping within screening&lt;/strong&gt; — kill candidates mid-CV the moment they're failing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Visual trial landscape&lt;/strong&gt; — a heatmap of the hyperparameter space in the web UI&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Timing breakdown in &lt;code&gt;summary_&lt;/code&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Open an issue on GitHub if any of these excite you — or if you have an idea I haven't thought of.&lt;/p&gt;




&lt;h2&gt;
  
  
  Try It Right Now
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;lazytune
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;📦 &lt;strong&gt;PyPI&lt;/strong&gt;: &lt;a href="https://pypi.org/project/lazytune/" rel="noopener noreferrer"&gt;pypi.org/project/lazytune&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;💻 &lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/anikchand461/lazytune" rel="noopener noreferrer"&gt;github.com/anikchand461/lazytune&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;🌐 &lt;strong&gt;Live Demo&lt;/strong&gt;: &lt;a href="https://lazytune.vercel.app" rel="noopener noreferrer"&gt;lazytune.vercel.app&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This started as frustration at 2am and is now a real thing people can &lt;code&gt;pip install&lt;/code&gt;. If you found it useful — a ❤️ on the post or a ⭐ on GitHub keeps me motivated to keep building.&lt;/p&gt;

&lt;p&gt;Happy tuning! 🚀&lt;/p&gt;

</description>
    </item>
    <item>
      <title>[Boost]</title>
      <dc:creator>Anik Chand</dc:creator>
      <pubDate>Thu, 09 Oct 2025 14:30:56 +0000</pubDate>
      <link>https://forem.com/anikchand461/-1g5g</link>
      <guid>https://forem.com/anikchand461/-1g5g</guid>
      <description>&lt;p&gt;

&lt;/p&gt;
&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/abhirajadhikary06/building-a-youtube-video-search-app-with-flask-whisper-and-rag-ebl" class="crayons-story__hidden-navigation-link"&gt;Building a YouTube Video Search App with Flask, Whisper, and RAG&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;

          &lt;a href="/abhirajadhikary06" class="crayons-avatar  crayons-avatar--l  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2096578%2F92bca5c8-a4a6-4407-8ff7-e2c0b7e5a9e5.png" alt="abhirajadhikary06 profile" class="crayons-avatar__image"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/abhirajadhikary06" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Abhiraj Adhikary
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Abhiraj Adhikary
                
              
              &lt;div id="story-author-preview-content-2908719" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/abhirajadhikary06" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2096578%2F92bca5c8-a4a6-4407-8ff7-e2c0b7e5a9e5.png" class="crayons-avatar__image" alt=""&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Abhiraj Adhikary&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

          &lt;/div&gt;
          &lt;a href="https://dev.to/abhirajadhikary06/building-a-youtube-video-search-app-with-flask-whisper-and-rag-ebl" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;Oct 9 '25&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/abhirajadhikary06/building-a-youtube-video-search-app-with-flask-whisper-and-rag-ebl" id="article-link-2908719"&gt;
          Building a YouTube Video Search App with Flask, Whisper, and RAG
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/whisper"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;whisper&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/flask"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;flask&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/mariadb"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;mariadb&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/hackathon"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;hackathon&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/abhirajadhikary06/building-a-youtube-video-search-app-with-flask-whisper-and-rag-ebl" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;1&lt;span class="hidden s:inline"&gt; reaction&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/abhirajadhikary06/building-a-youtube-video-search-app-with-flask-whisper-and-rag-ebl#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              Comments
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            5 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;




</description>
      <category>whisper</category>
      <category>flask</category>
      <category>mariadb</category>
      <category>hackathon</category>
    </item>
    <item>
      <title>How Google Translate &amp; ChatGPT Work: The Transformer, Unboxed</title>
      <dc:creator>Anik Chand</dc:creator>
      <pubDate>Thu, 09 Oct 2025 13:50:48 +0000</pubDate>
      <link>https://forem.com/anikchand461/how-google-translate-chatgpt-work-the-transformer-unboxed-3el</link>
      <guid>https://forem.com/anikchand461/how-google-translate-chatgpt-work-the-transformer-unboxed-3el</guid>
      <description>&lt;h1&gt;
  
  
  What &lt;em&gt;Exactly&lt;/em&gt; Is a Transformer? 🤔
&lt;/h1&gt;

&lt;p&gt;Ever used &lt;strong&gt;Google Translate&lt;/strong&gt; 🌍 or chatted with &lt;strong&gt;ChatGPT&lt;/strong&gt; 💬?&lt;br&gt;&lt;br&gt;
Behind both lies the same breakthrough: the &lt;strong&gt;Transformer&lt;/strong&gt; ⚡.&lt;/p&gt;

&lt;p&gt;Imagine an AI that doesn’t read sentences like a robot 🤖—one word at a time—but like a human 🧠:&lt;br&gt;&lt;br&gt;
&lt;strong&gt;instantly grasping how every word connects to every other word&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
That’s the Transformer.&lt;/p&gt;

&lt;p&gt;It’s a revolutionary AI architecture that &lt;strong&gt;understands and generates language by focusing on the most important words at once&lt;/strong&gt;—no slow, step-by-step reading required.&lt;/p&gt;

&lt;p&gt;Born in 2017 from the paper that changed everything—&lt;strong&gt;&lt;em&gt;“Attention Is All You Need”&lt;/em&gt;&lt;/strong&gt;✨—the Transformer ditched old-school methods and bet everything on one powerful idea: &lt;strong&gt;attention&lt;/strong&gt; 👀.&lt;/p&gt;

&lt;p&gt;And it worked. So well, in fact, that it now powers the smartest language tools you use every day.&lt;/p&gt;

&lt;p&gt;📄 &lt;strong&gt;Curious how it all began?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Read the original paper here: &lt;a href="https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf" rel="noopener noreferrer"&gt;“Attention Is All You Need”&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  High-Level Architecture: The Two Main Parts
&lt;/h2&gt;

&lt;p&gt;The Transformer has two big sections that work together, like two teams: the &lt;strong&gt;Encoder&lt;/strong&gt; and the &lt;strong&gt;Decoder&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The process works in a few simple steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Input Ready&lt;/strong&gt;: We take the starting sentence (here, the English one). We turn each word into a list of numbers (this is &lt;strong&gt;Embedding&lt;/strong&gt;), and we also add a special position code (&lt;strong&gt;Positional Encoding&lt;/strong&gt;) so the Transformer knows the order of the words.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;The Encoder's Job&lt;/strong&gt;: The &lt;strong&gt;Encoder&lt;/strong&gt; reads the whole input sentence and figures out the complete meaning and context of every word. It creates a detailed "thought" for the sentence.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;The Decoder's Job&lt;/strong&gt;: The &lt;strong&gt;Decoder&lt;/strong&gt; starts with a "Start" signal. It looks at the Encoder's "thought" and starts writing the new sentence (here, the Hindi one), one word at a time.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Final Output&lt;/strong&gt;: A simple layer (&lt;strong&gt;Linear and Softmax&lt;/strong&gt;) at the end chooses the most likely word to be the next one in the sentence.&lt;/li&gt;
&lt;/ol&gt;
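&lt;p&gt;The four steps above can be sketched as a schematic pipeline. This is only an illustration: every component below is a toy stand-in function, not a real model.&lt;/p&gt;

```python
def translate(source_tokens, embed, encoder, decoder, output_layer):
    """Schematic Transformer pipeline; each component is passed in as a function."""
    x = embed(source_tokens)              # 1. embedding + positional encoding
    thought = encoder(x)                  # 2. encoder builds the sentence "thought"
    generated = ["<start>"]
    while generated[-1] != "<end>" and len(generated) < 10:
        generated.append(output_layer(decoder(generated, thought)))  # 3 + 4
    return generated[1:-1]

# Toy stand-ins: a lookup table plays the role of the trained model.
table = {1: "तुम", 2: "कैसे", 3: "हो", 4: "<end>"}
out = translate(
    ["How", "are", "you"],
    embed=lambda toks: toks,
    encoder=lambda x: x,
    decoder=lambda gen, thought: len(gen),   # current step index
    output_layer=lambda step: table[step],
)
print(out)  # ['तुम', 'कैसे', 'हो']
```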

&lt;h3&gt;
  
  
  🧑‍🍳 A Quick Analogy: The Bilingual Cooking Show
&lt;/h3&gt;

&lt;p&gt;Imagine a chef who must &lt;strong&gt;recreate a dish from a foreign recipe&lt;/strong&gt;—but doesn’t speak the language.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;Encoder&lt;/strong&gt; is like a team of expert tasters who read the whole original recipe at once and create a &lt;strong&gt;Master Flavor Map&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;Decoder&lt;/strong&gt; is the recreating chef who:

&lt;ul&gt;
&lt;li&gt;Starts with a &lt;code&gt;&amp;lt;start&amp;gt;&lt;/code&gt; note,&lt;/li&gt;
&lt;li&gt;Can only look at what they’ve &lt;strong&gt;already cooked&lt;/strong&gt; (no peeking ahead!),&lt;/li&gt;
&lt;li&gt;And keeps glancing at the &lt;strong&gt;Flavor Map&lt;/strong&gt; to decide the next ingredient.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Finally, a &lt;strong&gt;pantry assistant&lt;/strong&gt; (Linear + Softmax) picks the most likely Hindi word (ingredient) for each step.&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;This is exactly how the Transformer translates &lt;em&gt;"How are you?"&lt;/em&gt; → &lt;em&gt;"तुम कैसे हो?"&lt;/em&gt;—one smart, attentive step at a time!&lt;/p&gt;

&lt;h3&gt;
  
  
  High-Level Block Diagram
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5cq5zt3is2ggfsrcmd40.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5cq5zt3is2ggfsrcmd40.png" alt=" " width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This shows the big picture. Now, let's open up these big blocks and see the smaller, powerful layers inside!&lt;/p&gt;




&lt;h2&gt;
  
  
  🏗️ Inside the Input Block (Encoder Input)
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;Input Block&lt;/strong&gt; is the very first step on both the &lt;strong&gt;Encoder&lt;/strong&gt; and &lt;strong&gt;Decoder&lt;/strong&gt; side of the Transformer. It takes the original words and prepares them for the attention layers.&lt;/p&gt;

&lt;p&gt;Let’s follow the flow in the diagram — from bottom to top — to see exactly how the input sentence gets ready for the Transformer.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnz01bbtumh5vwwwo8usq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnz01bbtumh5vwwwo8usq.png" alt=" " width="800" height="597"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Tokenizer
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;The sentence &lt;code&gt;"How are you"&lt;/code&gt; goes into the &lt;strong&gt;Tokenizer&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;It breaks the sentence into individual words (or subwords):
→ &lt;code&gt;"How"&lt;/code&gt;, &lt;code&gt;"are"&lt;/code&gt;, &lt;code&gt;"you"&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj490qonpj5xarj2uillr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj490qonpj5xarj2uillr.png" alt=" " width="800" height="754"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 Think of this like cutting a sandwich into pieces before eating it — the model works with individual tokens, not whole sentences.&lt;/p&gt;
&lt;/blockquote&gt;
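&lt;p&gt;A minimal sketch of this step, assuming plain whitespace splitting (real Transformers use subword tokenizers such as BPE or WordPiece instead):&lt;/p&gt;

```python
def tokenize(sentence):
    # Toy whitespace tokenizer; production models split into subwords instead.
    return sentence.split()

tokens = tokenize("How are you")
print(tokens)  # ['How', 'are', 'you']
```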

&lt;h3&gt;
  
  
  Step 2: Embedding (512 dim)
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxc15kj53dx6yp7eir544.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxc15kj53dx6yp7eir544.png" alt=" " width="800" height="216"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each word (&lt;code&gt;"How"&lt;/code&gt;, &lt;code&gt;"are"&lt;/code&gt;, &lt;code&gt;"you"&lt;/code&gt;) is sent to the &lt;strong&gt;Embedding&lt;/strong&gt; layer.&lt;/li&gt;
&lt;li&gt;This turns each word into a &lt;strong&gt;512-number list&lt;/strong&gt; (called a &lt;em&gt;vector&lt;/em&gt;).&lt;/li&gt;
&lt;li&gt;These are called &lt;strong&gt;Word Embeddings&lt;/strong&gt;: &lt;code&gt;E1&lt;/code&gt;, &lt;code&gt;E2&lt;/code&gt;, &lt;code&gt;E3&lt;/code&gt; (each is &lt;code&gt;(512,)&lt;/code&gt; — a vector of 512 numbers).&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;✅ Example:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;"How"&lt;/code&gt; → &lt;code&gt;E1&lt;/code&gt; = [0.2, -0.8, 0.9, ..., 0.1]&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;"are"&lt;/code&gt; → &lt;code&gt;E2&lt;/code&gt; = [0.7, 0.3, -0.6, ..., 0.4]
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;"you"&lt;/code&gt; → &lt;code&gt;E3&lt;/code&gt; = [-0.1, 0.9, 0.2, ..., -0.7]&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
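&lt;p&gt;A toy version of the Embedding layer, with a random lookup table standing in for the learned weights (the vocabulary and the values here are made up for illustration):&lt;/p&gt;

```python
import numpy as np

d_model = 512
vocab = {"How": 0, "are": 1, "you": 2}   # toy vocabulary

# A (vocab_size x 512) lookup table; in a real model these weights are learned.
rng = np.random.default_rng(0)
embedding_matrix = rng.normal(size=(len(vocab), d_model))

def embed(tokens):
    # Each token becomes one 512-number vector (E1, E2, E3 in the text).
    return np.stack([embedding_matrix[vocab[t]] for t in tokens])

E = embed(["How", "are", "you"])
print(E.shape)  # (3, 512)
```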

&lt;h3&gt;
  
  
  Step 3: Positional Embeddings
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1j33b7h91vfh67u41nen.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1j33b7h91vfh67u41nen.png" alt=" " width="800" height="199"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;At the same time, each word gets a &lt;strong&gt;Positional Embedding&lt;/strong&gt;: &lt;code&gt;P1&lt;/code&gt;, &lt;code&gt;P2&lt;/code&gt;, &lt;code&gt;P3&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;These are &lt;strong&gt;not learned&lt;/strong&gt; — they’re precomputed using special math (sine/cosine waves) so every position has a unique pattern.&lt;/li&gt;
&lt;li&gt;Why? So the model knows that &lt;code&gt;"How"&lt;/code&gt; is first, &lt;code&gt;"are"&lt;/code&gt; is second, &lt;code&gt;"you"&lt;/code&gt; is third.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;🧩 Without this, &lt;code&gt;"Cat chases dog"&lt;/code&gt; and &lt;code&gt;"Dog chases cat"&lt;/code&gt; would look identical!&lt;/p&gt;
&lt;/blockquote&gt;
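&lt;p&gt;The sine/cosine recipe from the paper can be computed directly; here is a minimal NumPy sketch:&lt;/p&gt;

```python
import numpy as np

def positional_encoding(seq_len, d_model=512):
    # PE(pos, 2i)   = sin(pos / 10000^(2i/d_model))
    # PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model))
    pos = np.arange(seq_len)[:, None]          # (seq_len, 1)
    i = np.arange(0, d_model, 2)[None, :]      # (1, d_model/2)
    angles = pos / np.power(10000, i / d_model)
    P = np.zeros((seq_len, d_model))
    P[:, 0::2] = np.sin(angles)                # even dimensions
    P[:, 1::2] = np.cos(angles)                # odd dimensions
    return P

P = positional_encoding(3)   # P1, P2, P3 for a 3-word sentence
print(P.shape)  # (3, 512)
```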

&lt;h3&gt;
  
  
  Step 4: Add Them Together → Positional Encoded Vectors
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;For each word, we &lt;strong&gt;add&lt;/strong&gt; its Word Embedding and Positional Embedding:

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;X1 = E1 + P1&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;X2 = E2 + P2&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;X3 = E3 + P3&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;These final vectors — &lt;code&gt;X1&lt;/code&gt;, &lt;code&gt;X2&lt;/code&gt;, &lt;code&gt;X3&lt;/code&gt; — are called &lt;strong&gt;Positional Encoded Vectors&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each is still &lt;strong&gt;512 numbers&lt;/strong&gt; — but now they contain &lt;strong&gt;both meaning AND position&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;🎯 This is the magic: the model now has all the info it needs to start paying attention!&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  📌 What Happens Next?
&lt;/h3&gt;

&lt;p&gt;These &lt;code&gt;X1&lt;/code&gt;, &lt;code&gt;X2&lt;/code&gt;, &lt;code&gt;X3&lt;/code&gt; vectors are now ready to go into the &lt;strong&gt;first Encoder block&lt;/strong&gt; — where the real “attention” begins!&lt;/p&gt;




&lt;h2&gt;
  
  
  🧱 Inside the Encoder Block
&lt;/h2&gt;

&lt;p&gt;The Encoder takes your input sentence (like &lt;em&gt;"How are you?"&lt;/em&gt;) and turns it into a deep, contextual understanding of every word.&lt;br&gt;&lt;br&gt;
It does this in &lt;strong&gt;two main steps&lt;/strong&gt;: first, a &lt;strong&gt;Multi-Head Attention Block&lt;/strong&gt; lets each word understand its relationship to all others. Then, a &lt;strong&gt;Feed Forward Neural Network Block&lt;/strong&gt; refines that meaning further.&lt;br&gt;&lt;br&gt;
This whole process repeats 6 times — each time making the understanding richer.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkp63e4yirisg466y2nil.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkp63e4yirisg466y2nil.png" alt=" " width="800" height="777"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let’s walk through one full Encoder block using the diagram above — from bottom to top.&lt;/p&gt;

&lt;h3&gt;
  
  
  ➡️ Step 1: Input — Positional Encoded Vectors (&lt;code&gt;X1&lt;/code&gt;, &lt;code&gt;X2&lt;/code&gt;, &lt;code&gt;X3&lt;/code&gt;)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Input shape: &lt;code&gt;(3, 512)&lt;/code&gt; → 3 words, each as a 512-number vector.&lt;/li&gt;
&lt;li&gt;These come from the &lt;strong&gt;Input Block&lt;/strong&gt; (after adding Word + Positional Embeddings).&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🟢 Step 2: Multi-Head Attention
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foc3swh5xfv4asby1cm3t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foc3swh5xfv4asby1cm3t.png" alt=" " width="800" height="301"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each word (&lt;code&gt;X1&lt;/code&gt;, &lt;code&gt;X2&lt;/code&gt;, &lt;code&gt;X3&lt;/code&gt;) looks at all other words to understand context.&lt;/li&gt;
&lt;li&gt;Output: &lt;strong&gt;Contextual Embeddings&lt;/strong&gt; → &lt;code&gt;Z1&lt;/code&gt;, &lt;code&gt;Z2&lt;/code&gt;, &lt;code&gt;Z3&lt;/code&gt; (still &lt;code&gt;(3, 512)&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;This is where the model learns that &lt;code&gt;"you"&lt;/code&gt; should pay attention to &lt;code&gt;"How"&lt;/code&gt; and &lt;code&gt;"are"&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
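&lt;p&gt;A simplified single-head sketch of the attention computation (the real block uses several heads and learned projections to produce Q, K and V; here we reuse X for all three):&lt;/p&gt;

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Each row of the weight matrix says how much one word attends to every other.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)
    return weights @ V, weights

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 512))   # X1, X2, X3
Z, weights = scaled_dot_product_attention(X, X, X)
print(Z.shape)  # (3, 512) -> contextual embeddings Z1, Z2, Z3
```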

&lt;h3&gt;
  
  
  ➕ Step 3: Residual Connection + Layer Normalisation
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvmj5h73ryo1kso5g8yz2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvmj5h73ryo1kso5g8yz2.png" alt=" " width="800" height="479"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Add original input back:
&lt;code&gt;Z1' = Z1 + X1&lt;/code&gt;
&lt;code&gt;Z2' = Z2 + X2&lt;/code&gt;
&lt;code&gt;Z3' = Z3 + X3&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Apply &lt;strong&gt;Layer Normalisation&lt;/strong&gt; → &lt;code&gt;Z1norm&lt;/code&gt;, &lt;code&gt;Z2norm&lt;/code&gt;, &lt;code&gt;Z3norm&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;✅ This helps the model train better — keeps information flowing without getting lost.&lt;/p&gt;
&lt;/blockquote&gt;
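&lt;p&gt;A minimal sketch of the residual-plus-normalisation step (the real LayerNorm also has learned scale and shift parameters, omitted here):&lt;/p&gt;

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    # Normalise each 512-number row to mean 0 and variance 1.
    mean = x.mean(axis=-1, keepdims=True)
    std = x.std(axis=-1, keepdims=True)
    return (x - mean) / (std + eps)

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 512))   # original inputs X1, X2, X3
Z = rng.normal(size=(3, 512))   # attention outputs Z1, Z2, Z3
Z_norm = layer_norm(Z + X)      # residual connection, then LayerNorm
print(Z_norm.shape)  # (3, 512)
```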

&lt;h3&gt;
  
  
  🟣 Step 4: Feed Forward Neural Network (FFNN) Block
&lt;/h3&gt;

&lt;p&gt;This is where each word gets its own private “thinking room”:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx13hthmdodrcwedoe4qa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx13hthmdodrcwedoe4qa.png" alt=" " width="800" height="737"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  A. First Linear Layer + ReLU
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Input: &lt;code&gt;Z1norm&lt;/code&gt;, &lt;code&gt;Z2norm&lt;/code&gt;, &lt;code&gt;Z3norm&lt;/code&gt; → &lt;code&gt;(3, 512)&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Multiply by weight matrix &lt;code&gt;W1&lt;/code&gt; (size &lt;code&gt;512 × 2048&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Add bias &lt;code&gt;B1&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Apply &lt;strong&gt;ReLU&lt;/strong&gt; → adds non-linearity → output shape: &lt;code&gt;(3, 2048)&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  B. Second Linear Layer
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Multiply by weight matrix &lt;code&gt;W2&lt;/code&gt; (size &lt;code&gt;2048 × 512&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Add bias &lt;code&gt;B2&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Output: &lt;code&gt;Y1&lt;/code&gt;, &lt;code&gt;Y2&lt;/code&gt;, &lt;code&gt;Y3&lt;/code&gt; → &lt;code&gt;(3, 512)&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 Think of this as a small brain for each word — refining its meaning after the group discussion.&lt;/p&gt;
&lt;/blockquote&gt;
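&lt;p&gt;Steps A and B as a NumPy sketch, with random matrices standing in for the learned weights &lt;code&gt;W1&lt;/code&gt; and &lt;code&gt;W2&lt;/code&gt;:&lt;/p&gt;

```python
import numpy as np

d_model, d_ff = 512, 2048
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(d_model, d_ff)) * 0.02, np.zeros(d_ff)
W2, b2 = rng.normal(size=(d_ff, d_model)) * 0.02, np.zeros(d_model)

def feed_forward(x):
    # (3, 512) -> (3, 2048) -> ReLU -> (3, 512), applied to each word independently.
    hidden = np.maximum(0, x @ W1 + b1)   # ReLU adds non-linearity
    return hidden @ W2 + b2

Y = feed_forward(rng.normal(size=(3, d_model)))
print(Y.shape)  # (3, 512)
```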

&lt;h3&gt;
  
  
  ➕ Step 5: Final Residual + Layer Normalisation
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd1b9xht77acinqutpft2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd1b9xht77acinqutpft2.png" alt=" " width="800" height="535"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Add the input (&lt;code&gt;Z1norm&lt;/code&gt;, &lt;code&gt;Z2norm&lt;/code&gt;, &lt;code&gt;Z3norm&lt;/code&gt;) back to the FFN output:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;Y1' = Y1 + Z1norm&lt;/code&gt;&lt;br&gt;
 &lt;code&gt;Y2' = Y2 + Z2norm&lt;/code&gt;&lt;br&gt;
 &lt;code&gt;Y3' = Y3 + Z3norm&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Then apply &lt;strong&gt;Layer Normalisation&lt;/strong&gt; → &lt;code&gt;Y1norm&lt;/code&gt;, &lt;code&gt;Y2norm&lt;/code&gt;, &lt;code&gt;Y3norm&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These become the &lt;strong&gt;final output of one Encoder block&lt;/strong&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;📌 &lt;strong&gt;Important&lt;/strong&gt;: In the original &lt;em&gt;“Attention Is All You Need”&lt;/em&gt; paper, this entire Encoder block is &lt;strong&gt;repeated 6 times in a chain&lt;/strong&gt;:&lt;/p&gt;


&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Input → Encoder 1 → Output 1 → Encoder 2 → Output 2 → Encoder 3 → Output 3 → Encoder 4 → Output 4 → Encoder 5 → Output 5 → Encoder 6 → Final Encoder Output
&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;Each encoder takes the output of the previous one as its input, building deeper and richer understanding at every stage.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  🧱 Inside the Decoder Input Block
&lt;/h2&gt;

&lt;p&gt;Now that the Encoder has finished its job, it’s time for the &lt;strong&gt;Decoder&lt;/strong&gt; to start writing the output sentence — but not quite yet. First, it needs its own special input.&lt;/p&gt;

&lt;p&gt;This is where the &lt;strong&gt;Decoder Input Block&lt;/strong&gt; comes in — and as the diagram shows, it’s almost identical to the Encoder Input Block… with one &lt;em&gt;very important twist&lt;/em&gt;: the &lt;strong&gt;Right Shift&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvaqr8mkjbbob7gyh6n2g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvaqr8mkjbbob7gyh6n2g.png" alt=" " width="800" height="675"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The Decoder Input Block prepares the &lt;em&gt;target&lt;/em&gt; sentence (e.g., &lt;code&gt;"तुम कैसे हो"&lt;/code&gt;) so the Decoder can learn to generate it one word at a time — without cheating by looking ahead.&lt;br&gt;&lt;br&gt;
It does this by adding a &lt;code&gt;&amp;lt;start&amp;gt;&lt;/code&gt; token and &lt;strong&gt;shifting everything right&lt;/strong&gt;, so each step only sees what came before.&lt;/p&gt;

&lt;h3&gt;
  
  
  ➡️ Step 1: Right Shift
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc23op26t199cmu5khk7q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc23op26t199cmu5khk7q.png" alt=" " width="800" height="736"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Start with the target sentence: &lt;code&gt;"तुम कैसे हो"&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Add a special &lt;code&gt;&amp;lt;start&amp;gt;&lt;/code&gt; token at the &lt;strong&gt;beginning&lt;/strong&gt;:
→ &lt;code&gt;"&amp;lt;start&amp;gt; तुम कैसे हो"&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Prepending &lt;code&gt;&amp;lt;start&amp;gt;&lt;/code&gt; &lt;strong&gt;shifts the entire sequence one position to the right&lt;/strong&gt;, so the decoder &lt;strong&gt;never sees the word it’s supposed to predict&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The result is a new input sequence for the decoder:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Position&lt;/th&gt;
&lt;th&gt;1&lt;/th&gt;
&lt;th&gt;2&lt;/th&gt;
&lt;th&gt;3&lt;/th&gt;
&lt;th&gt;4&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Decoder Input&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;&amp;lt;start&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;तुम&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;कैसे&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;हो&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Target Output&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;तुम&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;कैसे&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;हो&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;&amp;lt;end&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 &lt;strong&gt;Why?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
During training, the Decoder uses this shifted input to predict the &lt;strong&gt;next word&lt;/strong&gt;:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;To predict &lt;code&gt;"तुम"&lt;/code&gt;, it only sees &lt;code&gt;&amp;lt;start&amp;gt;&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;To predict &lt;code&gt;"कैसे"&lt;/code&gt;, it sees &lt;code&gt;&amp;lt;start&amp;gt; + तुम&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;It &lt;strong&gt;never sees "हो"&lt;/strong&gt; when predicting &lt;code&gt;"कैसे"&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This forces the model to generate text &lt;strong&gt;causally&lt;/strong&gt;—just like writing a sentence from left to right.&lt;/p&gt;
&lt;/blockquote&gt;
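&lt;p&gt;The shift is easy to express in code; this tiny helper builds the (decoder input, target output) pair shown in the table above:&lt;/p&gt;

```python
def make_decoder_pairs(target_tokens):
    # Prepend <start> for the input and append <end> for the target,
    # so position i of the input predicts position i of the target.
    decoder_input = ["<start>"] + target_tokens
    target_output = target_tokens + ["<end>"]
    return decoder_input, target_output

dec_in, dec_out = make_decoder_pairs(["तुम", "कैसे", "हो"])
print(dec_in)   # ['<start>', 'तुम', 'कैसे', 'हो']
print(dec_out)  # ['तुम', 'कैसे', 'हो', '<end>']
```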

&lt;h3&gt;
  
  
  ➡️ Step 2: Tokenizer
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdsw8hyt1a3ff6zygrf2j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdsw8hyt1a3ff6zygrf2j.png" alt=" " width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The shifted sequence (&lt;code&gt;&amp;lt;start&amp;gt; तुम कैसे हो&lt;/code&gt;) goes into the &lt;strong&gt;Tokenizer&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;It breaks it into individual tokens:
→ &lt;code&gt;&amp;lt;start&amp;gt;&lt;/code&gt;, &lt;code&gt;तुम&lt;/code&gt;, &lt;code&gt;कैसे&lt;/code&gt;, &lt;code&gt;हो&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  ➡️ Step 3: Embedding (512 dim)
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9tdpr8sweom6siv4yevt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9tdpr8sweom6siv4yevt.png" alt=" " width="800" height="190"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each token gets turned into a &lt;strong&gt;512-number vector&lt;/strong&gt; via the &lt;strong&gt;Embedding&lt;/strong&gt; layer.&lt;/li&gt;
&lt;li&gt;These are called &lt;strong&gt;Word Embeddings&lt;/strong&gt;: &lt;code&gt;E1&lt;/code&gt;, &lt;code&gt;E2&lt;/code&gt;, &lt;code&gt;E3&lt;/code&gt;, &lt;code&gt;E4&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;✅ Example:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;&amp;lt;start&amp;gt;&lt;/code&gt; → &lt;code&gt;E1&lt;/code&gt; = [0.1, -0.9, 0.3, ..., 0.7]
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;तुम&lt;/code&gt; → &lt;code&gt;E2&lt;/code&gt; = [0.8, 0.2, -0.6, ..., 0.1]
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;कैसे&lt;/code&gt; → &lt;code&gt;E3&lt;/code&gt; = [-0.4, 0.9, 0.5, ..., -0.3]
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;हो&lt;/code&gt; → &lt;code&gt;E4&lt;/code&gt; = [0.6, -0.1, 0.8, ..., 0.2]&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  ➡️ Step 4: Positional Embeddings
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fixmddskrpu8wdq62g6bg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fixmddskrpu8wdq62g6bg.png" alt=" " width="800" height="171"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Just like the Encoder, each token also gets a &lt;strong&gt;Positional Embedding&lt;/strong&gt;: &lt;code&gt;P1&lt;/code&gt;, &lt;code&gt;P2&lt;/code&gt;, &lt;code&gt;P3&lt;/code&gt;, &lt;code&gt;P4&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;These are precomputed (using sine/cosine waves) to tell the model the position of each token.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  ➡️ Step 5: Add Them Together → Positional Encoded Vectors
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;For each token, we &lt;strong&gt;add&lt;/strong&gt; its Word Embedding and Positional Embedding:&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;X1 = E1 + P1&lt;/code&gt; → for &lt;code&gt;&amp;lt;start&amp;gt;&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;X2 = E2 + P2&lt;/code&gt; → for &lt;code&gt;तुम&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;X3 = E3 + P3&lt;/code&gt; → for &lt;code&gt;कैसे&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;X4 = E4 + P4&lt;/code&gt; → for &lt;code&gt;हो&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These final vectors — &lt;code&gt;X1&lt;/code&gt;, &lt;code&gt;X2&lt;/code&gt;, &lt;code&gt;X3&lt;/code&gt;, &lt;code&gt;X4&lt;/code&gt; — are called &lt;strong&gt;Positional Encoded Vectors&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each is still &lt;strong&gt;512 numbers&lt;/strong&gt; — but now they contain &lt;strong&gt;both meaning AND position&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;🎯 This is the magic: the Decoder now has all the info it needs to start generating — one word at a time, without peeking ahead!&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  📌 What Happens Next?
&lt;/h3&gt;

&lt;p&gt;These &lt;code&gt;X1&lt;/code&gt;, &lt;code&gt;X2&lt;/code&gt;, &lt;code&gt;X3&lt;/code&gt;, &lt;code&gt;X4&lt;/code&gt; vectors are now ready to enter the &lt;strong&gt;first Decoder Block&lt;/strong&gt; — where they’ll meet the Encoder’s “thought” through &lt;strong&gt;Cross-Attention&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  🧱 Inside the Decoder Block
&lt;/h2&gt;

&lt;p&gt;Now that we have our &lt;strong&gt;Positional Encoded Vectors&lt;/strong&gt; (&lt;code&gt;X1&lt;/code&gt;, &lt;code&gt;X2&lt;/code&gt;, &lt;code&gt;X3&lt;/code&gt;, &lt;code&gt;X4&lt;/code&gt;) from the Decoder Input Block, they’re ready to enter the &lt;strong&gt;Decoder Block&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This block has &lt;strong&gt;three main parts&lt;/strong&gt;, stacked one after another:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Masked Self-Attention Block&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cross Attention Block&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Feed Forward Neural Network Block&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuggai9ax9zqizapr9449.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuggai9ax9zqizapr9449.png" alt=" " width="800" height="933"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And just like the Encoder, this whole structure is repeated &lt;strong&gt;6 times&lt;/strong&gt; (Decoder 1 → Decoder 6).&lt;/p&gt;

&lt;p&gt;Let’s walk through one full Decoder block — from bottom to top — using a detailed diagram.&lt;/p&gt;

&lt;h3&gt;
  
  
  ➡️ Step 1: Input — Positional Encoded Vectors (&lt;code&gt;X1&lt;/code&gt;, &lt;code&gt;X2&lt;/code&gt;, &lt;code&gt;X3&lt;/code&gt;, &lt;code&gt;X4&lt;/code&gt;)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Input shape: &lt;code&gt;(4, 512)&lt;/code&gt; → 4 words (including &lt;code&gt;&amp;lt;start&amp;gt;&lt;/code&gt;), each as a 512-number vector.&lt;/li&gt;
&lt;li&gt;These come from the &lt;strong&gt;Decoder Input Block&lt;/strong&gt; (after adding Word + Positional Embeddings).&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🟢 Step 2: Masked Multi-Head Attention
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Each word (&lt;code&gt;X1&lt;/code&gt;, &lt;code&gt;X2&lt;/code&gt;, &lt;code&gt;X3&lt;/code&gt;, &lt;code&gt;X4&lt;/code&gt;) looks at &lt;strong&gt;all previous words&lt;/strong&gt; — but &lt;strong&gt;not future ones&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Why? Because during training, the model must predict the next word without seeing it!&lt;/li&gt;
&lt;li&gt;This is called &lt;strong&gt;Masked Self-Attention&lt;/strong&gt; — the “mask” blocks out future positions.&lt;/li&gt;
&lt;li&gt;Output: &lt;strong&gt;Contextual Embeddings&lt;/strong&gt; → &lt;code&gt;Z1&lt;/code&gt;, &lt;code&gt;Z2&lt;/code&gt;, &lt;code&gt;Z3&lt;/code&gt;, &lt;code&gt;Z4&lt;/code&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy2dfacxk7q8iul9zrbjc.png" alt=" " width="800" height="249"&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 Example:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When predicting &lt;code&gt;"कैसे"&lt;/code&gt;, it can see &lt;code&gt;&amp;lt;start&amp;gt;&lt;/code&gt; and &lt;code&gt;तुम&lt;/code&gt; — but not &lt;code&gt;हो&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
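&lt;p&gt;The masking trick above can be sketched in NumPy. This is a toy single-head version where, purely for illustration, the learned Query/Key/Value projection matrices are skipped and the inputs are used directly:&lt;/p&gt;

```python
import numpy as np

def masked_self_attention(X):
    # Toy single-head masked self-attention (sketch: the learned
    # Q/K/V projection matrices are omitted, X is used directly).
    n, d = X.shape
    scores = X @ X.T / np.sqrt(d)                 # (n, n) similarity scores
    future = np.triu(np.ones((n, n)), k=1)        # 1s mark future positions
    scores = np.where(future == 1, -1e9, scores)  # the "mask": block the future
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = e / e.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ X, weights                   # contextual Z1..Z4, attention map

X = np.random.randn(4, 512)          # the 4 decoder inputs X1..X4
Z, W = masked_self_attention(X)
print(Z.shape)   # (4, 512)
```

Each row of `W` attends only to itself and earlier positions — every entry above the diagonal ends up at zero.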

&lt;h3&gt;
  
  
  ➕ Step 3: Residual Connection + Layer Normalisation
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Add original input back:
&lt;code&gt;Z1' = Z1 + X1&lt;/code&gt;
&lt;code&gt;Z2' = Z2 + X2&lt;/code&gt;
&lt;code&gt;Z3' = Z3 + X3&lt;/code&gt;
&lt;code&gt;Z4' = Z4 + X4&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Apply &lt;strong&gt;Layer Normalisation&lt;/strong&gt; → &lt;code&gt;Z1norm&lt;/code&gt;, &lt;code&gt;Z2norm&lt;/code&gt;, &lt;code&gt;Z3norm&lt;/code&gt;, &lt;code&gt;Z4norm&lt;/code&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frv0ge552rsbv6jcvk3i6.png" alt=" " width="800" height="398"&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;✅ This helps the model train better — keeps information flowing without getting lost.&lt;/p&gt;
&lt;/blockquote&gt;
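&lt;p&gt;As a quick sketch (assuming plain layer normalisation, without the learned gain and bias parameters the real layer also carries), the add-and-norm step looks like this:&lt;/p&gt;

```python
import numpy as np

def add_and_norm(Z, X, eps=1e-6):
    # Residual connection: Z1' = Z1 + X1, ... then layer normalisation
    # applied independently to each word vector.
    Zres = Z + X
    mean = Zres.mean(axis=-1, keepdims=True)
    std = Zres.std(axis=-1, keepdims=True)
    return (Zres - mean) / (std + eps)     # Z1norm .. Z4norm

Znorm = add_and_norm(np.random.randn(4, 512), np.random.randn(4, 512))
print(Znorm.shape)   # (4, 512)
```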

&lt;h3&gt;
  
  
  🟠 Step 4: Cross Attention
&lt;/h3&gt;

&lt;p&gt;This is where the magic happens — the &lt;strong&gt;Decoder talks to the Encoder&lt;/strong&gt;!&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffbe9pptmk76hsi80gdr1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffbe9pptmk76hsi80gdr1.png" alt=" " width="800" height="388"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The Decoder takes its own normalized vectors (&lt;code&gt;Z1norm&lt;/code&gt;, &lt;code&gt;Z2norm&lt;/code&gt;, &lt;code&gt;Z3norm&lt;/code&gt;, &lt;code&gt;Z4norm&lt;/code&gt;) as &lt;strong&gt;Queries&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;It uses the &lt;strong&gt;Encoder’s final output&lt;/strong&gt; (from Encoder 6) as &lt;strong&gt;Keys and Values&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;This lets the Decoder focus on the most relevant parts of the input sentence.

&lt;ul&gt;
&lt;li&gt;For example: when generating &lt;code&gt;"हो"&lt;/code&gt;, it might look back at the Encoder’s understanding of &lt;code&gt;"you"&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;Output: &lt;strong&gt;Cross-Attention Embeddings&lt;/strong&gt; → &lt;code&gt;Zc1&lt;/code&gt;, &lt;code&gt;Zc2&lt;/code&gt;, &lt;code&gt;Zc3&lt;/code&gt;, &lt;code&gt;Zc4&lt;/code&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 Think of this as the Decoder asking: &lt;em&gt;“Hey Encoder — what part of the English sentence should I focus on right now?”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
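&lt;p&gt;A minimal NumPy sketch of that Query/Key/Value flow (again with the learned projection matrices omitted for brevity): the Queries come from the Decoder, while the Keys and Values come from the Encoder's final output.&lt;/p&gt;

```python
import numpy as np

def cross_attention(Znorm, enc_out):
    # Queries from the Decoder, Keys/Values from the Encoder.
    # (Sketch: the learned Wq/Wk/Wv projections are omitted.)
    d = Znorm.shape[-1]
    scores = Znorm @ enc_out.T / np.sqrt(d)       # (4, 3): each Hindi position attends over English words
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = e / e.sum(axis=-1, keepdims=True)   # softmax over Encoder positions
    return weights @ enc_out                      # Zc1 .. Zc4

dec = np.random.randn(4, 512)   # Z1norm .. Z4norm from the Decoder
enc = np.random.randn(3, 512)   # Encoder 6 output for "How are you"
Zc = cross_attention(dec, enc)
print(Zc.shape)   # (4, 512)
```

Note the shapes: 4 decoder positions each produce a weighted mix of the 3 encoder vectors, so the output stays `(4, 512)`.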

&lt;h3&gt;
  
  
  ➕ Step 5: Residual Connection + Layer Normalisation
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb4d2p9fzr0glgj1wbuu0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb4d2p9fzr0glgj1wbuu0.png" alt=" " width="800" height="427"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Add the input (&lt;code&gt;Z1norm&lt;/code&gt;, &lt;code&gt;Z2norm&lt;/code&gt;, &lt;code&gt;Z3norm&lt;/code&gt;, &lt;code&gt;Z4norm&lt;/code&gt;) back to the cross-attention output:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;Zc1' = Zc1 + Z1norm&lt;/code&gt;&lt;br&gt;
 &lt;code&gt;Zc2' = Zc2 + Z2norm&lt;/code&gt;&lt;br&gt;
 &lt;code&gt;Zc3' = Zc3 + Z3norm&lt;/code&gt;&lt;br&gt;
 &lt;code&gt;Zc4' = Zc4 + Z4norm&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Apply &lt;strong&gt;Layer Normalisation&lt;/strong&gt; → &lt;code&gt;Zc1norm&lt;/code&gt;, &lt;code&gt;Zc2norm&lt;/code&gt;, &lt;code&gt;Zc3norm&lt;/code&gt;, &lt;code&gt;Zc4norm&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🟣 Step 6: Feed Forward Neural Network (FFNN) Block
&lt;/h3&gt;

&lt;p&gt;This is where each word gets its own private “thinking room” — same as in the Encoder:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqjiqa3z5x9slltpuix16.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqjiqa3z5x9slltpuix16.png" alt=" " width="800" height="722"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  A. First Linear Layer + ReLU
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Input: &lt;code&gt;Zc1norm&lt;/code&gt;, &lt;code&gt;Zc2norm&lt;/code&gt;, &lt;code&gt;Zc3norm&lt;/code&gt;, &lt;code&gt;Zc4norm&lt;/code&gt; → &lt;code&gt;(4, 512)&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Multiply by weight matrix &lt;code&gt;W1&lt;/code&gt; (size &lt;code&gt;512 × 2048&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Add bias &lt;code&gt;B1&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Apply &lt;strong&gt;ReLU&lt;/strong&gt; → adds non-linearity → output shape: &lt;code&gt;(4, 2048)&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  B. Second Linear Layer
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Multiply by weight matrix &lt;code&gt;W2&lt;/code&gt; (size &lt;code&gt;2048 × 512&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Add bias &lt;code&gt;B2&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Output: &lt;code&gt;Y1&lt;/code&gt;, &lt;code&gt;Y2&lt;/code&gt;, &lt;code&gt;Y3&lt;/code&gt;, &lt;code&gt;Y4&lt;/code&gt; → &lt;code&gt;(4, 512)&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 Think of this as refining each word’s meaning after listening to both itself (self-attention) and the Encoder (cross-attention).&lt;/p&gt;
&lt;/blockquote&gt;
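&lt;p&gt;The two layers above amount to just a few matrix operations. Here is a sketch with randomly initialised weights standing in for the learned ones:&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = 0.02 * rng.normal(size=(512, 2048)), np.zeros(2048)   # expand 512 -> 2048
W2, b2 = 0.02 * rng.normal(size=(2048, 512)), np.zeros(512)    # project 2048 -> 512

def ffn(Zc_norm):
    # Position-wise feed-forward: each word vector is processed independently.
    h = np.maximum(0.0, Zc_norm @ W1 + b1)    # ReLU, shape (4, 2048)
    return h @ W2 + b2                        # Y1 .. Y4, shape (4, 512)

Y = ffn(rng.normal(size=(4, 512)))
print(Y.shape)   # (4, 512)
```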

&lt;h3&gt;
  
  
  ➕ Step 7: Final Residual + Layer Normalisation
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpfbzj0h5hm3c2i80p8z4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpfbzj0h5hm3c2i80p8z4.png" alt=" " width="800" height="460"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Add the input (&lt;code&gt;Zc1norm&lt;/code&gt;, &lt;code&gt;Zc2norm&lt;/code&gt;, &lt;code&gt;Zc3norm&lt;/code&gt;, &lt;code&gt;Zc4norm&lt;/code&gt;) back to the FFN output:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;Y1' = Y1 + Zc1norm&lt;/code&gt;&lt;br&gt;
 &lt;code&gt;Y2' = Y2 + Zc2norm&lt;/code&gt;&lt;br&gt;
 &lt;code&gt;Y3' = Y3 + Zc3norm&lt;/code&gt;&lt;br&gt;
 &lt;code&gt;Y4' = Y4 + Zc4norm&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Apply &lt;strong&gt;Layer Normalisation&lt;/strong&gt; → &lt;code&gt;Y1norm&lt;/code&gt;, &lt;code&gt;Y2norm&lt;/code&gt;, &lt;code&gt;Y3norm&lt;/code&gt;, &lt;code&gt;Y4norm&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These become the &lt;strong&gt;final output of one Decoder block&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  🔄 Repeat 6 Times
&lt;/h3&gt;

&lt;p&gt;This entire process — Masked Self-Attention → Residual → Norm → Cross-Attention → Residual → Norm → FFN → Residual → Norm — happens &lt;strong&gt;6 times in a row&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;After Decoder 6, the model has a &lt;strong&gt;rich, context-aware understanding&lt;/strong&gt; of what to generate next — ready for the &lt;strong&gt;Final Output Block&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  🎯 Final Output Block: Turning Numbers into Words
&lt;/h2&gt;

&lt;p&gt;After the last Decoder block (Decoder 6), we have four final vectors: &lt;code&gt;Y1fnorm&lt;/code&gt;, &lt;code&gt;Y2fnorm&lt;/code&gt;, &lt;code&gt;Y3fnorm&lt;/code&gt;, &lt;code&gt;Y4fnorm&lt;/code&gt; — each of shape &lt;code&gt;(512,)&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4d3hp9yupbgubyr3nzxx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4d3hp9yupbgubyr3nzxx.png" alt=" " width="800" height="289"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;These vectors are the model’s “best guess” for what each word in the output sentence should be. But they’re still just numbers. To turn them into actual words like &lt;code&gt;"तुम"&lt;/code&gt;, &lt;code&gt;"कैसे"&lt;/code&gt;, &lt;code&gt;"हो"&lt;/code&gt;, and &lt;code&gt;&amp;lt;end&amp;gt;&lt;/code&gt;, we need the &lt;strong&gt;Final Output Block&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This block is &lt;strong&gt;applied once for each output position&lt;/strong&gt; — the same linear + softmax layers are reused for all 4 words.&lt;/p&gt;

&lt;p&gt;Let’s walk through the &lt;strong&gt;first block&lt;/strong&gt; — the one that predicts the very first word: &lt;code&gt;"तुम"&lt;/code&gt;.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feoz1zq2gv09od5tr5d4u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feoz1zq2gv09od5tr5d4u.png" alt=" " width="800" height="882"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  ➡️ Step 1: Input — &lt;code&gt;Y1fnorm&lt;/code&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;This is the final vector from Decoder 6 for the first position.&lt;/li&gt;
&lt;li&gt;Shape: &lt;code&gt;(512,)&lt;/code&gt; → 512 numbers.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🟣 Step 2: Linear Layer (512 → V)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;The vector goes into a &lt;strong&gt;linear layer&lt;/strong&gt; with weights of size &lt;code&gt;512 × V&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;V&lt;/code&gt; = number of unique words in the output vocabulary (e.g., all Hindi words + &lt;code&gt;&amp;lt;start&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;end&amp;gt;&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Output: &lt;code&gt;V values&lt;/code&gt; — one score for every possible word.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 Think of this as a giant lookup table: it asks, &lt;em&gt;“Given these 512 numbers, how likely is each word to be the next one?”&lt;/em&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frk8r6mpf9tfq26oqhrh2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frk8r6mpf9tfq26oqhrh2.png" alt=" " width="800" height="408"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  🟠 Step 3: Softmax
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;V values&lt;/code&gt; go through a &lt;strong&gt;softmax&lt;/strong&gt; function.&lt;/li&gt;
&lt;li&gt;This turns the scores into &lt;strong&gt;probabilities&lt;/strong&gt; — adding up to 1.0.&lt;/li&gt;
&lt;li&gt;Output: &lt;code&gt;V probability values&lt;/code&gt; — e.g., 90% chance of &lt;code&gt;"तुम"&lt;/code&gt;, 5% of &lt;code&gt;"कैसे"&lt;/code&gt;, etc.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🟢 Step 4: Normalisation
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4cirl48in7yvzjxfygun.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4cirl48in7yvzjxfygun.png" alt=" " width="800" height="301"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;strong&gt;Normalisation&lt;/strong&gt; step ensures the scores are well-scaled probabilities that sum to 1.&lt;/li&gt;
&lt;li&gt;In practice, the softmax already does this normalisation — it isn’t a separate learned layer.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🎯 Step 5: Return Highest Probability Value
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzlxf4gyqdgmwyeh09uxv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzlxf4gyqdgmwyeh09uxv.png" alt=" " width="800" height="320"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The model picks the word with the &lt;strong&gt;highest probability&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;For the first position → it picks &lt;code&gt;"तुम"&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;✅ This is how the Transformer generates its first word!&lt;/p&gt;
&lt;/blockquote&gt;
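&lt;p&gt;Steps 2–5 condense into just a few lines. The sketch below uses a toy vocabulary size and random weights for the linear layer (the real &lt;code&gt;V&lt;/code&gt; would be tens of thousands of words, and the weights would be learned):&lt;/p&gt;

```python
import numpy as np

V = 10_000                                  # toy vocabulary size (assumption)
rng = np.random.default_rng(0)
W_out = 0.02 * rng.normal(size=(512, V))    # the 512 x V linear layer

def predict_word(y_fnorm):
    # Linear layer: one score per vocabulary word
    logits = y_fnorm @ W_out
    # Softmax: scores become probabilities summing to 1
    e = np.exp(logits - logits.max())
    probs = e / e.sum()
    # Greedy pick: return the highest-probability word index
    return int(probs.argmax()), probs

idx, probs = predict_word(rng.normal(size=512))
print(round(probs.sum(), 6))   # 1.0
```

In a real model, `idx` would be looked up in the vocabulary to recover the actual word, e.g. `"तुम"`.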

&lt;h3&gt;
  
  
  🔁 Repeat for All Positions
&lt;/h3&gt;

&lt;p&gt;The same process happens for the other three positions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Position 2 → &lt;code&gt;Y2fnorm&lt;/code&gt; → predicts &lt;code&gt;"कैसे"&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Position 3 → &lt;code&gt;Y3fnorm&lt;/code&gt; → predicts &lt;code&gt;"हो"&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Position 4 → &lt;code&gt;Y4fnorm&lt;/code&gt; → predicts &lt;code&gt;&amp;lt;end&amp;gt;&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each block is identical — only the input vector (&lt;code&gt;Y1fnorm&lt;/code&gt;, &lt;code&gt;Y2fnorm&lt;/code&gt;, etc.) changes.&lt;/p&gt;

&lt;h3&gt;
  
  
  📌 Why This Works
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;The model doesn’t generate all words at once — it does one at a time.&lt;/li&gt;
&lt;li&gt;Each prediction is based on the full context built by the Encoder and Decoder.&lt;/li&gt;
&lt;li&gt;The final linear + softmax layer is like a “vocabulary selector” — turning abstract numbers into real words.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the &lt;strong&gt;final step&lt;/strong&gt; in the Transformer — where numbers become language!&lt;/p&gt;




&lt;h2&gt;
  
  
  🌟 Conclusion: The Transformer, Demystified
&lt;/h2&gt;

&lt;p&gt;You’ve just walked through the entire Transformer — from raw words to fluent translation — one block at a time.&lt;/p&gt;

&lt;p&gt;You can view the complete diagram here: &lt;a href="https://drive.google.com/file/d/1lz68fKBnUtsqi9_9q_7J6MrikSu2oA8e/view?usp=sharing" rel="noopener noreferrer"&gt;https://drive.google.com/file/d/1lz68fKBnUtsqi9_9q_7J6MrikSu2oA8e/view?usp=sharing&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;No magic. No mystery. Just &lt;strong&gt;smart design&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Attention&lt;/strong&gt; that sees relationships,&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Positional codes&lt;/strong&gt; that preserve order,&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Residual connections&lt;/strong&gt; that keep learning stable,&lt;/li&gt;
&lt;li&gt;And &lt;strong&gt;parallel processing&lt;/strong&gt; that makes it fast.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What started as a sentence — &lt;em&gt;"How are you?"&lt;/em&gt; — became numbers, then context, then meaning, and finally: &lt;em&gt;"तुम कैसे हो?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;And the best part?&lt;br&gt;&lt;br&gt;
&lt;strong&gt;You now understand how it works — not just at a high level, but deep down to the vectors, layers, and shapes.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The Transformer isn’t just a model. It’s the foundation of modern AI — from translation and chatbots to code generation and beyond.&lt;/p&gt;

&lt;p&gt;And you?&lt;br&gt;&lt;br&gt;
You didn’t just read about it.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;You followed the data all the way through.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Go ahead — share what you’ve learned.&lt;br&gt;&lt;br&gt;
Because now, you truly &lt;em&gt;see&lt;/em&gt; the machine behind the magic. 💫&lt;/p&gt;

</description>
      <category>ai</category>
      <category>architecture</category>
      <category>llm</category>
    </item>
    <item>
      <title>A Beginner's GAN Adventure with Digits</title>
      <dc:creator>Anik Chand</dc:creator>
      <pubDate>Tue, 07 Oct 2025 20:52:23 +0000</pubDate>
      <link>https://forem.com/anikchand461/a-beginners-gan-adventure-with-digits-54g4</link>
      <guid>https://forem.com/anikchand461/a-beginners-gan-adventure-with-digits-54g4</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;It was a rainy afternoon in September. The view from my window was all gray and &lt;em&gt;blurry&lt;/em&gt;. That reminded me of the first images my GAN made – just fuzzy chaos on the screen. 🌧️&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I'd been learning Deep Learning for months. I wanted to create something new with code. Not just numbers, but something that &lt;em&gt;looks real&lt;/em&gt;.  &lt;/p&gt;

&lt;p&gt;So, I tried &lt;strong&gt;Generative Adversarial Networks&lt;/strong&gt;. I built it from scratch with &lt;strong&gt;Keras&lt;/strong&gt; and &lt;strong&gt;TensorFlow&lt;/strong&gt;, taking help from ChatGPT. 💻&lt;/p&gt;

&lt;p&gt;The dataset? &lt;strong&gt;MNIST&lt;/strong&gt;. It has 70,000 handwritten digits. They look like quick notes from people long ago – scribbled 7s and curvy 8s. 📝&lt;/p&gt;

&lt;p&gt;Think of two neural networks working &lt;em&gt;against each other&lt;/em&gt;:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;Generator&lt;/strong&gt; takes random noise – just a bunch of random numbers. It tries to turn that into a digit image. At first, it makes blurry shapes. You might guess it's a 3... or something else. 🔄
&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;Discriminator&lt;/strong&gt; checks the images. It knows real MNIST images. It says &lt;em&gt;"fake"&lt;/em&gt; to the bad ones and &lt;em&gt;"real"&lt;/em&gt; to the good ones. 👮‍♂️
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flxwilx7exvjy0eb0kckc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flxwilx7exvjy0eb0kckc.png" alt=" " width="800" height="504"&gt;&lt;/a&gt;&lt;br&gt;
&lt;small&gt;Source: &lt;a href="https://dzone.com/articles/working-principles-of-generative-adversarial-netwo" rel="noopener noreferrer"&gt;DZone Article on GAN Principles&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Think of GANs like a counterfeiter (Generator) trying to make fake money that fools the police (Discriminator). The bank provides real money for training. Over time, the fakes get better! 💸&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Architecture of a GAN network:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftmqobuggadvxtsu07iqa.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftmqobuggadvxtsu07iqa.jpeg" alt=" " width="800" height="271"&gt;&lt;/a&gt;&lt;br&gt;
&lt;small&gt;Source: &lt;a href="https://jonathan-hui.medium.com/gan-why-it-is-so-hard-to-train-generative-advisory-networks-819a86b3750b" rel="noopener noreferrer"&gt;Jonathan Hui on Medium&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;

&lt;p&gt;The cool part? They learn &lt;em&gt;together&lt;/em&gt;. The Generator gets better at tricking the Discriminator. The Discriminator gets better at spotting fakes. After many training steps, the fake digits look real. Like they were drawn by hand. 🎨  &lt;/p&gt;

&lt;p&gt;I did this project as an experiment. I wanted to understand how GANs work. It was fun to see it come together. Next, I plan to use this to make &lt;em&gt;realistic human faces&lt;/em&gt;. 😎  &lt;/p&gt;




&lt;h2&gt;
  
  
  Building the Networks 🛠️
&lt;/h2&gt;

&lt;p&gt;I used a Jupyter notebook to build it. The Generator starts with 100 random numbers. It uses dense layers and LeakyReLU to shape them. Then it turns them into a 28x28 image. It's like building a picture from nothing. 🖼️&lt;/p&gt;

&lt;p&gt;The Discriminator takes 28x28 images. It uses Conv2D layers to look for patterns. It ends with a yes or no: real or fake.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Parts&lt;/th&gt;
&lt;th&gt;Generator&lt;/th&gt;
&lt;th&gt;Discriminator&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;100 random numbers&lt;/td&gt;
&lt;td&gt;28x28 image&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Main Layers&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Dense then reshape&lt;/td&gt;
&lt;td&gt;Conv2D then flatten&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Activations&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;LeakyReLU, Tanh&lt;/td&gt;
&lt;td&gt;LeakyReLU, Sigmoid&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fake image&lt;/td&gt;
&lt;td&gt;Real (1) or Fake (0)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
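&lt;p&gt;Here's a compact Keras sketch of the two networks from the table above. It follows standard DCGAN conventions — the exact kernel sizes and dropout rates are assumptions for illustration, not a copy of my notebook:&lt;/p&gt;

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_generator():
    # 100 noise numbers -> Dense -> reshape -> upsample to a 28x28 image
    return tf.keras.Sequential([
        tf.keras.Input(shape=(100,)),
        layers.Dense(7 * 7 * 256),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Reshape((7, 7, 256)),
        layers.Conv2DTranspose(128, 5, strides=1, padding="same"),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Conv2DTranspose(64, 5, strides=2, padding="same"),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Conv2DTranspose(1, 5, strides=2, padding="same",
                               activation="tanh"),   # fake 28x28 image out
    ])

def build_discriminator():
    # 28x28 image -> Conv2D features -> single real/fake probability
    return tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28, 1)),
        layers.Conv2D(64, 5, strides=2, padding="same"),
        layers.LeakyReLU(),
        layers.Dropout(0.3),
        layers.Conv2D(128, 5, strides=2, padding="same"),
        layers.LeakyReLU(),
        layers.Dropout(0.3),
        layers.Flatten(),
        layers.Dense(1, activation="sigmoid"),       # real (1) or fake (0)
    ])

print(build_generator().output_shape)      # (None, 28, 28, 1)
print(build_discriminator().output_shape)  # (None, 1)
```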

&lt;p&gt;At the end of this blog, I've explained more about the Discriminator and Generator networks I used in my code.&lt;/p&gt;

&lt;p&gt;Training took time on my GPU. I saved the models so I could start again if needed. I also saved images every few steps to see progress. Later, I made a GIF from them. And I zipped all the images together. ⏳&lt;/p&gt;




&lt;h2&gt;
  
  
  Seeing the Digits Improve 📈
&lt;/h2&gt;

&lt;p&gt;Around epoch 5, I saw a shaky 4 appear. That was exciting. By epoch 25, the digits looked good. 1s had straight lines. 9s had curves. At epoch 100, the Discriminator could hardly tell the fakes from the real images. That meant success.&lt;/p&gt;

&lt;p&gt;Here are some images from training:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pure Noise&lt;/th&gt;
&lt;th&gt;Epoch 1&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9dazbw679f6zxrjo50wm.png" alt="Pure Noise" width="369" height="369"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjjwa48mv4t5l49bg3b6f.png" alt="Epoch 1" width="400" height="400"&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;From total randomness to the first blurry hints of digits – like fog lifting just a bit.&lt;/em&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Epoch 2&lt;/th&gt;
&lt;th&gt;Epoch 3&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd6idv0edw87dz5ukjjer.png" alt="Epoch 2" width="400" height="400"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxtcq5f8x02wlemife8la.png" alt="Epoch 3" width="400" height="400"&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;Faint shapes emerging... is that a 6? The Generator is starting to get the idea.&lt;/em&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Epoch 9&lt;/th&gt;
&lt;th&gt;Epoch 17&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnea0em5d56mw5ewkjuh7.png" alt="Epoch 9" width="400" height="400"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5wvd4ko62hvprghj5unf.png" alt="Epoch 17" width="400" height="400"&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;Things are sharpening up! Each digit feels a little more like real handwriting, with its own quirks.&lt;/em&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Epoch 25&lt;/th&gt;
&lt;th&gt;Epoch 50&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx9d6zdh118znr3b8k662.png" alt="Epoch 25" width="400" height="400"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1ky91fwn23k6ft9rfvva.png" alt="Epoch 50" width="400" height="400"&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;Confidence building – straight lines for 1s, smooth curves for 9s. The duel is heating up.&lt;/em&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Epoch 75&lt;/th&gt;
&lt;th&gt;Epoch 100&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgb8eusx29y6mi5kc3mwp.png" alt="Epoch 75" width="400" height="400"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft5lazljql0gjbs6ezkvo.png" alt="Epoch 100" width="400" height="400"&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;Nearly there! These fakes could pass for real MNIST scribbles. Magic in the making. ✨&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Here is a GIF of 16 digits improving over time.  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhdoynmm4or7r8iy6i3r9.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhdoynmm4or7r8iy6i3r9.gif" alt="gif" width="400" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;It shows how random noise turns into clear digits in 10 seconds.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  GAN Architecture in My Project 🔧
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Discriminator Architecture
&lt;/h3&gt;

&lt;p&gt;The Discriminator network processes 28x28 grayscale images and outputs a probability (0 for fake, 1 for real). Here's the layer-by-layer breakdown:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer Name&lt;/th&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Input Shape&lt;/th&gt;
&lt;th&gt;Output Shape&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;conv2d_6&lt;/td&gt;
&lt;td&gt;Conv2D&lt;/td&gt;
&lt;td&gt;(None, 28, 28, 1)&lt;/td&gt;
&lt;td&gt;(None, 14, 14, 64)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;leaky_re_lu_30&lt;/td&gt;
&lt;td&gt;LeakyReLU&lt;/td&gt;
&lt;td&gt;(None, 14, 14, 64)&lt;/td&gt;
&lt;td&gt;(None, 14, 14, 64)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;dropout_6&lt;/td&gt;
&lt;td&gt;Dropout&lt;/td&gt;
&lt;td&gt;(None, 14, 14, 64)&lt;/td&gt;
&lt;td&gt;(None, 14, 14, 64)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;conv2d_7&lt;/td&gt;
&lt;td&gt;Conv2D&lt;/td&gt;
&lt;td&gt;(None, 14, 14, 64)&lt;/td&gt;
&lt;td&gt;(None, 7, 7, 128)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;leaky_re_lu_31&lt;/td&gt;
&lt;td&gt;LeakyReLU&lt;/td&gt;
&lt;td&gt;(None, 7, 7, 128)&lt;/td&gt;
&lt;td&gt;(None, 7, 7, 128)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;dropout_7&lt;/td&gt;
&lt;td&gt;Dropout&lt;/td&gt;
&lt;td&gt;(None, 7, 7, 128)&lt;/td&gt;
&lt;td&gt;(None, 7, 7, 128)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;flatten_3&lt;/td&gt;
&lt;td&gt;Flatten&lt;/td&gt;
&lt;td&gt;(None, 7, 7, 128)&lt;/td&gt;
&lt;td&gt;(None, 6272)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;dense_11&lt;/td&gt;
&lt;td&gt;Dense&lt;/td&gt;
&lt;td&gt;(None, 6272)&lt;/td&gt;
&lt;td&gt;(None, 1)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This architecture uses convolutional layers for feature extraction, dropout for regularization to prevent overfitting, and LeakyReLU activations to maintain gradient flow.&lt;/p&gt;
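
&lt;p&gt;You can sanity-check the spatial sizes in the table with a few lines of plain Python. Assuming stride-2 convolutions with &lt;code&gt;same&lt;/code&gt; padding (which is what the 28 → 14 → 7 progression implies), each Conv2D halves the spatial dimensions:&lt;/p&gt;

```python
import math

def conv_same_out(size, stride=2):
    """Spatial output size of a strided Conv2D with 'same' padding."""
    return math.ceil(size / stride)

side = 28
side = conv_same_out(side)      # conv2d_6: 28x28 -> 14x14
assert side == 14
side = conv_same_out(side)      # conv2d_7: 14x14 -> 7x7
assert side == 7

# flatten_3: 7 * 7 * 128 feature values -> 6272 inputs to dense_11
assert 7 * 7 * 128 == 6272
print("discriminator shapes check out")
```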

&lt;h3&gt;
  
  
  Generator Architecture
&lt;/h3&gt;

&lt;p&gt;The Generator network starts with random noise input and progressively upsamples it to produce 28x28 grayscale digit images. Here's the layer-by-layer breakdown:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer Name&lt;/th&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Input Shape&lt;/th&gt;
&lt;th&gt;Output Shape&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;dense_9&lt;/td&gt;
&lt;td&gt;Dense&lt;/td&gt;
&lt;td&gt;(None, 100)&lt;/td&gt;
&lt;td&gt;(None, 12544)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;batch_normalization_18&lt;/td&gt;
&lt;td&gt;BatchNormalization&lt;/td&gt;
&lt;td&gt;(None, 12544)&lt;/td&gt;
&lt;td&gt;(None, 12544)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;leaky_re_lu_24&lt;/td&gt;
&lt;td&gt;LeakyReLU&lt;/td&gt;
&lt;td&gt;(None, 12544)&lt;/td&gt;
&lt;td&gt;(None, 12544)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;reshape_6&lt;/td&gt;
&lt;td&gt;Reshape&lt;/td&gt;
&lt;td&gt;(None, 12544)&lt;/td&gt;
&lt;td&gt;(None, 7, 7, 256)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;conv2d_transpose_18&lt;/td&gt;
&lt;td&gt;Conv2DTranspose&lt;/td&gt;
&lt;td&gt;(None, 7, 7, 256)&lt;/td&gt;
&lt;td&gt;(None, 7, 7, 128)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;batch_normalization_19&lt;/td&gt;
&lt;td&gt;BatchNormalization&lt;/td&gt;
&lt;td&gt;(None, 7, 7, 128)&lt;/td&gt;
&lt;td&gt;(None, 7, 7, 128)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;leaky_re_lu_25&lt;/td&gt;
&lt;td&gt;LeakyReLU&lt;/td&gt;
&lt;td&gt;(None, 7, 7, 128)&lt;/td&gt;
&lt;td&gt;(None, 7, 7, 128)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;conv2d_transpose_19&lt;/td&gt;
&lt;td&gt;Conv2DTranspose&lt;/td&gt;
&lt;td&gt;(None, 7, 7, 128)&lt;/td&gt;
&lt;td&gt;(None, 14, 14, 64)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;batch_normalization_20&lt;/td&gt;
&lt;td&gt;BatchNormalization&lt;/td&gt;
&lt;td&gt;(None, 14, 14, 64)&lt;/td&gt;
&lt;td&gt;(None, 14, 14, 64)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;leaky_re_lu_26&lt;/td&gt;
&lt;td&gt;LeakyReLU&lt;/td&gt;
&lt;td&gt;(None, 14, 14, 64)&lt;/td&gt;
&lt;td&gt;(None, 14, 14, 64)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;conv2d_transpose_20&lt;/td&gt;
&lt;td&gt;Conv2DTranspose&lt;/td&gt;
&lt;td&gt;(None, 14, 14, 64)&lt;/td&gt;
&lt;td&gt;(None, 28, 28, 1)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This architecture uses transposed convolutions for upsampling, batch normalization for stability, and LeakyReLU activations to prevent vanishing gradients.&lt;/p&gt;
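
&lt;p&gt;The same arithmetic runs in reverse here: the Dense layer has to emit exactly 7 × 7 × 256 = 12544 values for the Reshape to work, and each stride-2 &lt;code&gt;Conv2DTranspose&lt;/code&gt; with &lt;code&gt;same&lt;/code&gt; padding doubles the spatial size (the first transposed convolution keeps 7×7, so it uses stride 1; these strides are what the shapes imply):&lt;/p&gt;

```python
# dense_9 must emit 7 * 7 * 256 values so reshape_6 can fold them
# into a (7, 7, 256) feature map.
assert 7 * 7 * 256 == 12544

# conv2d_transpose_18 keeps 7x7 (stride 1); the two stride-2 transposed
# convolutions then double the spatial size each time.
side = 7
side *= 2        # conv2d_transpose_19: 7x7 -> 14x14
side *= 2        # conv2d_transpose_20: 14x14 -> 28x28
assert side == 28
print("generator shapes check out")
```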




&lt;p&gt;When the rain stopped, I looked back through all the generated images. This project taught me a lot about GANs, and it showed how trial and error can lead to something genuinely new. Next, I want better ways to measure quality, like FID scores, deeper networks, and maybe color images from CIFAR-10.&lt;/p&gt;

&lt;p&gt;I also want to build Conditional GANs, so I can generate a specific digit on demand, like just 7s.&lt;/p&gt;

&lt;p&gt;Check out the code on &lt;a href="https://github.com/anikchand461/GAN-MNIST" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; and try it yourself. What GAN projects have you built? Tell me in the comments.&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>ai</category>
      <category>beginners</category>
      <category>python</category>
    </item>
    <item>
      <title>Sentiment Analysis, the Classical Way — No Deep Learning</title>
      <dc:creator>Anik Chand</dc:creator>
      <pubDate>Wed, 06 Aug 2025 14:12:30 +0000</pubDate>
      <link>https://forem.com/anikchand461/sentiment-analysis-the-classical-way-no-deep-learning-3lc6</link>
      <guid>https://forem.com/anikchand461/sentiment-analysis-the-classical-way-no-deep-learning-3lc6</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“Can you get high accuracy in sentiment analysis without touching deep learning?”&lt;/em&gt;&lt;br&gt;
That's the question that sparked my curiosity — and led to a project that amazed me with its results.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In this blog, I’ll walk you through my journey of building a sentiment analysis system &lt;strong&gt;using only Machine Learning&lt;/strong&gt; — no neural networks, no transformers, just classic ML — and still achieving an impressive &lt;strong&gt;accuracy of 89.12%&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  📌 Problem Statement
&lt;/h2&gt;

&lt;p&gt;The goal: &lt;strong&gt;Sentiment Analysis&lt;/strong&gt; — classifying text as either positive or negative.&lt;/p&gt;

&lt;p&gt;Rather than relying on RNNs, LSTMs, or BERT, I challenged myself to stay within the boundaries of &lt;strong&gt;classical machine learning algorithms&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  🧪 Dataset
&lt;/h2&gt;

&lt;p&gt;I used the &lt;strong&gt;IMDb movie reviews dataset&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;40,000 training reviews&lt;/li&gt;
&lt;li&gt;10,000 testing reviews&lt;/li&gt;
&lt;li&gt;Binary labels: positive / negative&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🧹 Preprocessing Steps
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Lowercasing&lt;/strong&gt; text&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Removing HTML tags&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cleaning punctuation&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Tokenization&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stopword removal&lt;/strong&gt; (using NLTK)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stemming&lt;/strong&gt; with Porter Stemmer&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vectorization&lt;/strong&gt; using &lt;strong&gt;TF-IDF&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;
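
&lt;p&gt;Here's a minimal sketch of steps 1–4 using only the standard library (the NLTK stopword removal, Porter stemming, and TF-IDF from steps 5–7 are left out, and the exact cleaning rules in my project differ slightly):&lt;/p&gt;

```python
import string
from html.parser import HTMLParser

# Minimal sketch of preprocessing steps 1-4. The real pipeline also
# removes NLTK stopwords and applies Porter stemming (steps 5-6).

class TagStripper(HTMLParser):
    """Keeps only the text content of an HTML fragment (step 2)."""
    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

def preprocess(review):
    stripper = TagStripper()
    stripper.feed(review.lower())          # step 1: lowercase
    text = " ".join(stripper.parts)        # step 2: HTML tags dropped
    text = text.translate(str.maketrans("", "", string.punctuation))  # step 3
    return text.split()                    # step 4: tokenize

print(preprocess("This movie was ASTONISHING... truly great!"))
# ['this', 'movie', 'was', 'astonishing', 'truly', 'great']
```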




&lt;h2&gt;
  
  
  ⚙️ Baseline Model: GaussianNB
&lt;/h2&gt;

&lt;p&gt;To begin, I tested the simplest model possible: &lt;strong&gt;Gaussian Naive Bayes&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;📈 &lt;strong&gt;Accuracy&lt;/strong&gt;: &lt;code&gt;82.00%&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This gave me a quick baseline — but I knew I could push further.&lt;/p&gt;




&lt;h2&gt;
  
  
  🔍 Model Experiments
&lt;/h2&gt;

&lt;p&gt;I tested multiple models:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.naive_bayes&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MultinomialNB&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BernoulliNB&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.linear_model&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LogisticRegression&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;RidgeClassifier&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SGDClassifier&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PassiveAggressiveClassifier&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.svm&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LinearSVC&lt;/span&gt;

&lt;span class="n"&gt;clf2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MultinomialNB&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;clf3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;BernoulliNB&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;clf4&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LogisticRegression&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;saga&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_iter&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;clf5&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LinearSVC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_iter&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;clf6&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SGDClassifier&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;log_loss&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_iter&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tol&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1e-3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;clf7&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;RidgeClassifier&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;solver&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;auto&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;clf8&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PassiveAggressiveClassifier&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_iter&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tol&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1e-3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;early_stopping&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The four best accuracies:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqlh6epnyiprxu1bi88o4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqlh6epnyiprxu1bi88o4.png" alt=" " width="800" height="480"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  🧪 GridSearchCV for Tuning
&lt;/h2&gt;

&lt;p&gt;I applied &lt;strong&gt;GridSearchCV&lt;/strong&gt; on the top 4 models:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Logistic Regression
&lt;/span&gt;&lt;span class="n"&gt;param_grid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;C&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.01&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;solver&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;saga&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;max_iter&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;
&lt;span class="c1"&gt;# Best: {'C': 0.1, 'solver': 'saga'} =&amp;gt; 88.7%
&lt;/span&gt;
&lt;span class="c1"&gt;# LinearSVC
&lt;/span&gt;&lt;span class="n"&gt;param_grid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;C&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.01&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;max_iter&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;
&lt;span class="c1"&gt;# Best: {'C': 0.01} =&amp;gt; 88.645%
&lt;/span&gt;
&lt;span class="c1"&gt;# MultinomialNB
&lt;/span&gt;&lt;span class="n"&gt;param_grid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;alpha&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.01&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;5.0&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;
&lt;span class="c1"&gt;# Best: {'alpha': 1.0} =&amp;gt; 85.37%
&lt;/span&gt;
&lt;span class="c1"&gt;# SGDClassifier
&lt;/span&gt;&lt;span class="n"&gt;param_grid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;alpha&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.0001&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.001&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.01&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;loss&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;hinge&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;log_loss&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;penalty&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;l2&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;l1&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;elasticnet&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;max_iter&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="c1"&gt;# Best: {'alpha': 0.001, 'loss': 'log_loss', 'penalty': 'l2'} =&amp;gt; 88.75%
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🧪 Final Accuracy after Retraining
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn77fxf9rcbbipt5y4fev.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn77fxf9rcbbipt5y4fev.png" alt=" " width="800" height="480"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🧠 Stacking Ensemble
&lt;/h2&gt;

&lt;p&gt;To boost performance further, I implemented &lt;strong&gt;stacking&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Base Models&lt;/strong&gt;: SGD, LogisticRegression, LinearSVC&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Meta Model&lt;/strong&gt;: LogisticRegression&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;📈 &lt;strong&gt;Stacking Accuracy&lt;/strong&gt;: &lt;code&gt;89.12%&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi64c1bzr1xwf9vkf02po.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi64c1bzr1xwf9vkf02po.png" alt=" " width="800" height="531"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🧪 Tried Deep Stacking
&lt;/h2&gt;

&lt;p&gt;I also tried multi-layer stacking:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Layer 1&lt;/strong&gt;: Logistic, LinearSVC, MultinomialNB, SGD&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Layer 2&lt;/strong&gt;: LogisticRegression&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Layer 3&lt;/strong&gt;: RidgeClassifier, SGD → Final &lt;strong&gt;VotingClassifier&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But… accuracy slightly dropped:&lt;/p&gt;

&lt;p&gt;📉 &lt;strong&gt;Accuracy&lt;/strong&gt;: &lt;code&gt;89.09%&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fphx214zayudi3cpifgn0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fphx214zayudi3cpifgn0.png" alt=" " width="800" height="364"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;🔁 I reverted to the &lt;strong&gt;1-layer stacking&lt;/strong&gt;, which performed best.&lt;/p&gt;




&lt;h2&gt;
  
  
  🔧 Tools &amp;amp; Libraries Used
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Python&lt;/li&gt;
&lt;li&gt;Scikit-learn&lt;/li&gt;
&lt;li&gt;Pandas&lt;/li&gt;
&lt;li&gt;NLTK&lt;/li&gt;
&lt;li&gt;Matplotlib / Seaborn&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🎯 Key Learnings
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Traditional ML can still compete with deep learning in text tasks&lt;/li&gt;
&lt;li&gt;Logistic Regression + TF-IDF = surprisingly powerful&lt;/li&gt;
&lt;li&gt;Ensemble methods like stacking can push the limits&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  📝 Final Words
&lt;/h2&gt;

&lt;p&gt;This project wasn't about beating deep learning. It was about &lt;strong&gt;challenging assumptions&lt;/strong&gt; — and proving that, with the right setup, classic ML still holds its ground.&lt;/p&gt;

&lt;p&gt;If you're new to ML or want to understand the fundamentals before diving into deep models, &lt;strong&gt;this path is for you&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Feel free to connect, share thoughts, or collaborate. I’d love to hear your feedback!&lt;/p&gt;

&lt;p&gt;Repo: &lt;a href="https://github.com/anikchand461/sentiment-analysis" rel="noopener noreferrer"&gt;https://github.com/anikchand461/sentiment-analysis&lt;/a&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>machinelearning</category>
      <category>ai</category>
      <category>deeplearning</category>
    </item>
    <item>
      <title>The Curve That Judges Your ML Model</title>
      <dc:creator>Anik Chand</dc:creator>
      <pubDate>Sun, 13 Jul 2025 14:34:01 +0000</pubDate>
      <link>https://forem.com/anikchand461/understanding-the-auc-roc-curve-in-machine-learning-with-python-code-3nlc</link>
      <guid>https://forem.com/anikchand461/understanding-the-auc-roc-curve-in-machine-learning-with-python-code-3nlc</guid>
      <description>&lt;h3&gt;
  
  
  Ever built a model and felt proud of its 95% accuracy, only to find out it’s not that great after all? 😅
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;I used to think the AUC-ROC curve was some complicated graph that only expert data scientists talked about. But once I understood it, I realized it’s actually pretty simple — and super useful!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In this blog, I’ll explain the ROC curve in a way that’s easy to understand. We’ll see how it helps you figure out how good your model really is — with simple examples, pictures, and Python code.&lt;/p&gt;




&lt;h2&gt;
  
  
  🔍 Why Accuracy Isn’t Always the Hero
&lt;/h2&gt;

&lt;p&gt;Let’s say you built a model to detect a rare disease. 99 out of 100 people don’t have it.&lt;/p&gt;

&lt;p&gt;Now imagine your model just predicts “No disease” for everyone.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Accuracy? 99%&lt;/li&gt;
&lt;li&gt;Helpful? Not at all. You missed the one person who actually has the disease.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is where smarter metrics come in — things like Precision, Recall, and the star of today’s show: &lt;strong&gt;AUC-ROC&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  ✨ So, What’s This ROC Curve Anyway?
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;ROC (Receiver Operating Characteristic)&lt;/strong&gt; curve shows how good your model is at distinguishing between two classes — like spam vs. not spam.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;X-Axis:&lt;/strong&gt; False Positive Rate (FPR) — how often the model cries wolf&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Y-Axis:&lt;/strong&gt; True Positive Rate (TPR) — how often it catches the real deal&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As you move the decision threshold, these rates change. The ROC curve just plots these changes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhvkfhw3un46e6n23lszl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhvkfhw3un46e6n23lszl.png" alt=" " width="800" height="661"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If the curve hugs the top-left corner — you’ve got a great model. If it sticks to the diagonal? Might as well toss a coin.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  📝 A Real-Life Example (Email Spam Classifier)
&lt;/h2&gt;

&lt;p&gt;Suppose you built a model to detect spam. It gives probabilities like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Email    Prob(Spam)
A        0.45
B        0.29
C        0.61
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let’s say your threshold is 0.5:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&amp;gt; 0.5 → Spam&lt;/li&gt;
&lt;li&gt;≤ 0.5 → Not spam&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🎯 Picking the right threshold matters. Why?&lt;/p&gt;

&lt;p&gt;Because there are two types of mistakes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Predicting not spam when it is spam → ⚠️ You miss an actual threat.&lt;/li&gt;
&lt;li&gt;Predicting spam when it’s not → 🤷 Just an annoying false alarm.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Depending on your use case, one error might be worse than the other.&lt;/p&gt;




&lt;h2&gt;
  
  
  📊 Quick Refresher: Confusion Matrix
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;             Predicted
            1       0
Actual  1   TP      FN
        0   FP      TN
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;TP&lt;/strong&gt; = True Positive&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;FN&lt;/strong&gt; = False Negative&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;FP&lt;/strong&gt; = False Positive&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TN&lt;/strong&gt; = True Negative&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;From this we get:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;TPR&lt;/strong&gt; = TP / (TP + FN) — how many real positives you caught&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;FPR&lt;/strong&gt; = FP / (FP + TN) — how many times you cried wolf&lt;/li&gt;
&lt;/ul&gt;
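
&lt;p&gt;In code, each rate is one line. Plugging in the spam-filter counts used in the example further down (TP = 80, FN = 20, FP = 20, TN = 80):&lt;/p&gt;

```python
def rates(tp, fn, fp, tn):
    """TPR and FPR from the four confusion-matrix counts."""
    return tp / (tp + fn), fp / (fp + tn)

# 100 spam emails, 80 caught; 100 legit emails, 20 wrongly flagged
tpr, fpr = rates(tp=80, fn=20, fp=20, tn=80)
print(f"TPR = {tpr:.0%}, FPR = {fpr:.0%}")  # TPR = 80%, FPR = 20%
```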




&lt;h2&gt;
  
  
  📉 Threshold Changes — What Happens?
&lt;/h2&gt;

&lt;p&gt;Changing your threshold is like adjusting the sensitivity of your spam filter:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lower threshold → You catch more spam (high TPR), but mislabel legit emails too (high FPR)&lt;/li&gt;
&lt;li&gt;Higher threshold → Fewer false alarms, but you might miss actual spam&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The ROC curve shows this trade-off for every possible threshold.&lt;/p&gt;
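
&lt;p&gt;You can watch the trade-off happen with a toy example: sweep the threshold over a handful of made-up spam scores and TPR and FPR rise and fall together:&lt;/p&gt;

```python
# Toy spam scores and true labels (1 = spam, 0 = legit); both made up
scores = [0.95, 0.80, 0.61, 0.45, 0.29, 0.10]
labels = [1,    1,    0,    1,    0,    0]

def tpr_fpr(threshold):
    """TPR and FPR when every email scoring at or above the threshold is flagged."""
    preds = [int(s >= threshold) for s in scores]
    tp = sum(1 for p, l in zip(preds, labels) if p == 1 and l == 1)
    fp = sum(1 for p, l in zip(preds, labels) if p == 1 and l == 0)
    return tp / sum(labels), fp / (len(labels) - sum(labels))

for t in (0.2, 0.5, 0.9):
    tpr, fpr = tpr_fpr(t)
    print(f"threshold {t}: TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```

Each (FPR, TPR) pair is one point on the ROC curve; plotting them for every threshold traces the whole curve.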




&lt;h2&gt;
  
  
  💡 Spam Detection in Action
&lt;/h2&gt;

&lt;p&gt;Let’s say:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You have 200 emails: 100 spam, 100 not spam&lt;/li&gt;
&lt;li&gt;Your model detects 80 spam correctly → TPR = 80%&lt;/li&gt;
&lt;li&gt;But it wrongly flags 20 legit emails → FPR = 20%&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🎯 Goal: Keep TPR high and FPR low.&lt;/p&gt;

&lt;p&gt;Another fun one:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Netflix Churn Prediction&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;You predict who will cancel their subscription&lt;/li&gt;
&lt;li&gt;False positive = predicting a loyal user will leave → Not great for business&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;




&lt;h2&gt;
  
  
  📈 How to Read a ROC Curve
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Y-axis:&lt;/strong&gt; TPR (catching the good stuff)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;X-axis:&lt;/strong&gt; FPR (making false calls)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As you tweak the threshold:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Low threshold → High TPR and high FPR&lt;/li&gt;
&lt;li&gt;High threshold → Low TPR and low FPR&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We want the sweet spot where TPR is high and FPR is low.&lt;/p&gt;




&lt;h2&gt;
  
  
  ✨ AUC: The Area Under That Curve
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;AUC (Area Under the ROC Curve)&lt;/strong&gt; tells us how good your model is overall — across all thresholds.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AUC = 1.0 → Perfect model&lt;/li&gt;
&lt;li&gt;AUC = 0.5 → Random guess&lt;/li&gt;
&lt;li&gt;AUC &amp;lt; 0.5 → Your model might be predicting backwards 😅&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AUC basically says: &lt;em&gt;Pick any random spam and non-spam email — what’s the chance the model ranks the spam one higher?&lt;/em&gt;&lt;/p&gt;
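
&lt;p&gt;That ranking interpretation is easy to compute by hand: compare every (spam, not-spam) pair of scores and count how often the spam email ranks higher. A tiny sketch with made-up scores:&lt;/p&gt;

```python
from itertools import product

# Hypothetical model scores: higher should mean "more likely spam"
spam_scores     = [0.90, 0.75, 0.60]
not_spam_scores = [0.40, 0.30, 0.65]

# AUC = fraction of (spam, not-spam) pairs the model ranks correctly.
# Ties would conventionally count as half a correct pair (none occur here).
pairs = list(product(spam_scores, not_spam_scores))
auc = sum(1 for s, n in pairs if s > n) / len(pairs)
print(f"pairwise AUC = {auc:.2f}")  # 8 of 9 pairs ordered correctly
```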

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fze1as6nd7pz1hm0hxw6e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fze1as6nd7pz1hm0hxw6e.png" alt=" " width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Model M1: AUC = 0.85&lt;/li&gt;
&lt;li&gt;Model M2: AUC = 0.70&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;→ M1 is clearly better at separating spam from not spam.&lt;/p&gt;




&lt;h2&gt;
  
  
  👨‍💻 Try It in Python
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.datasets&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;make_classification&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.model_selection&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;train_test_split&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.ensemble&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RandomForestClassifier&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.metrics&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;roc_curve&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;auc&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;matplotlib.pyplot&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;

&lt;span class="c1"&gt;# Dummy data
&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;make_classification&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_samples&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n_classes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;weights&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;random_state&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;X_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;X_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_test&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;train_test_split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;random_state&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Train model
&lt;/span&gt;&lt;span class="n"&gt;clf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;RandomForestClassifier&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;clf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_train&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Probabilities
&lt;/span&gt;&lt;span class="n"&gt;y_probs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;clf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;predict_proba&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_test&lt;/span&gt;&lt;span class="p"&gt;)[:,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# ROC stuff
&lt;/span&gt;&lt;span class="n"&gt;fpr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tpr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;roc_curve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_probs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;roc_auc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;auc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fpr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tpr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Plot
&lt;/span&gt;&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fpr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tpr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;AUC = &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;roc_auc&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;--&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;False Positive Rate&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;True Positive Rate&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ROC Curve&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;legend&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;grid&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🔖 One-Liner to Get AUC
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.metrics&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;roc_auc_score&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AUC:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;roc_auc_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_probs&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🧠 Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Accuracy isn’t always enough&lt;/li&gt;
&lt;li&gt;ROC curve helps &lt;strong&gt;visualize&lt;/strong&gt; your classifier’s skill&lt;/li&gt;
&lt;li&gt;AUC gives an &lt;strong&gt;overall score&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Thresholds change how sensitive your model is&lt;/li&gt;
&lt;/ul&gt;
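&lt;p&gt;That last takeaway is easy to see in code. Here is a minimal sketch (the synthetic data and &lt;code&gt;RandomForestClassifier&lt;/code&gt; mirror the earlier example; the three threshold values are just illustrative): lowering the decision threshold flags more samples as positive, so recall (sensitivity) can only go up.&lt;/p&gt;

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Same style of synthetic, imbalanced data as the ROC example above
X, y = make_classification(n_samples=1000, n_classes=2,
                           weights=[0.7, 0.3], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

clf = RandomForestClassifier(random_state=42).fit(X_train, y_train)
probs = clf.predict_proba(X_test)[:, 1]

# Sweep the decision threshold: a lower cutoff produces more positive
# predictions, so recall is monotonically non-increasing in the threshold
recalls = {t: recall_score(y_test, (probs >= t).astype(int))
           for t in (0.3, 0.5, 0.7)}
print(recalls)
```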




&lt;h2&gt;
  
  
  🚀 Wrap-Up
&lt;/h2&gt;

&lt;p&gt;AUC-ROC isn’t just a fancy graph — it helps you really &lt;strong&gt;understand&lt;/strong&gt; your model. Whether you’re filtering spam, detecting diseases, or predicting churn — this curve has your back.&lt;/p&gt;

&lt;p&gt;So next time someone mentions AUC, you can nod, smile, and maybe even draw it too. 😉&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If this helped you, follow me on &lt;a href="https://github.com/anikchand461" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; or &lt;a href="https://www.linkedin.com/in/anik-chand-3b14b12b6/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; for more ML breakdowns!&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;#machinelearning #datascience #python #roc #auc #classification&lt;/strong&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>🚢 OneHotEncoder Shape Mismatch Mystery in Titanic Dataset — Solved!</title>
      <dc:creator>Anik Chand</dc:creator>
      <pubDate>Mon, 07 Apr 2025 12:48:13 +0000</pubDate>
      <link>https://forem.com/anikchand461/onehotencoder-shape-mismatch-mystery-in-titanic-dataset-solved-3n6</link>
      <guid>https://forem.com/anikchand461/onehotencoder-shape-mismatch-mystery-in-titanic-dataset-solved-3n6</guid>
      <description>&lt;p&gt;Hi everyone! 👋 I'm currently working through the Titanic dataset as part of the &lt;strong&gt;CampusX YouTube course&lt;/strong&gt;, and I ran into an interesting issue involving &lt;code&gt;OneHotEncoder&lt;/code&gt; and &lt;code&gt;SimpleImputer&lt;/code&gt; that I &lt;em&gt;finally&lt;/em&gt; understood after digging into the problem.&lt;/p&gt;

&lt;p&gt;This blog is all about that journey — what caused the shape mismatch between training and testing data, and how I fixed it. If you're also working on preprocessing categorical variables in machine learning, this might save you a few hours of debugging!&lt;/p&gt;




&lt;h2&gt;
  
  
  🧠 The Setup
&lt;/h2&gt;

&lt;p&gt;We’re using the Titanic dataset for classification (predicting survival), and like most people, I’m preprocessing the &lt;code&gt;Sex&lt;/code&gt; and &lt;code&gt;Embarked&lt;/code&gt; columns using:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;SimpleImputer&lt;/strong&gt; to handle missing values
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OneHotEncoder&lt;/strong&gt; to convert categorical variables into numerical format&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here’s a snippet of what I had:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.impute&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SimpleImputer&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.preprocessing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OneHotEncoder&lt;/span&gt;

&lt;span class="n"&gt;si_embarked&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SimpleImputer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;strategy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;most_frequent&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;ohe_embarked&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OneHotEncoder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sparse&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;handle_unknown&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ignore&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Imputation
&lt;/span&gt;&lt;span class="n"&gt;x_train_embarked&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;si_embarked&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit_transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x_train&lt;/span&gt;&lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Embarked&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt;
&lt;span class="n"&gt;x_test_embarked&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;si_embarked&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x_test&lt;/span&gt;&lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Embarked&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt;

&lt;span class="c1"&gt;# Encoding
&lt;/span&gt;&lt;span class="n"&gt;x_train_embarked&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ohe_embarked&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit_transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x_train&lt;/span&gt;&lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Embarked&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt;
&lt;span class="n"&gt;x_test_embarked&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ohe_embarked&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x_test&lt;/span&gt;&lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Embarked&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  ⚠️ The Issue
&lt;/h2&gt;

&lt;p&gt;After running this, I checked the shapes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;x_train_embarked&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;  &lt;span class="err"&gt;→&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;712&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;x_test_embarked&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;   &lt;span class="err"&gt;→&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;179&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Wait — what?!&lt;br&gt;&lt;br&gt;
Why does the train set have 4 columns, and the test set has only 3, even though I used &lt;code&gt;handle_unknown='ignore'&lt;/code&gt;?&lt;/p&gt;

&lt;p&gt;Wasn’t that supposed to handle unknown categories safely?&lt;/p&gt;


&lt;h2&gt;
  
  
  🕵️‍♂️ Investigating the Root Cause
&lt;/h2&gt;

&lt;p&gt;I ran a few more checks and realized something sneaky:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;x_train&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Embarked&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;isnull&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# Output: 2
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Hmm… that’s weird. I &lt;strong&gt;thought&lt;/strong&gt; I had already imputed missing values. But then I remembered this part of my code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;x_train_embarked&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;si_embarked&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit_transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x_train&lt;/span&gt;&lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Embarked&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Aha! 💡 I had imputed missing values into a &lt;em&gt;new variable&lt;/em&gt;, &lt;code&gt;x_train_embarked&lt;/code&gt;, but &lt;strong&gt;I never updated the original &lt;code&gt;x_train&lt;/code&gt; DataFrame&lt;/strong&gt;!&lt;/p&gt;

&lt;p&gt;That means the original &lt;code&gt;x_train['Embarked']&lt;/code&gt; still had &lt;code&gt;NaN&lt;/code&gt; values when I called &lt;code&gt;.fit()&lt;/code&gt; on the encoder:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;x_train&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Embarked&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ohe_embarked&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit_transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x_train&lt;/span&gt;&lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Embarked&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This caused the encoder to treat &lt;strong&gt;&lt;code&gt;NaN&lt;/code&gt; as a valid category&lt;/strong&gt;, resulting in 4 categories being learned:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;C&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Q&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;S&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;NaN&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
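&lt;p&gt;You can reproduce this behaviour in isolation with a toy column (hypothetical data, not the actual Titanic split; recent scikit-learn versions, 0.24 and later, treat &lt;code&gt;np.nan&lt;/code&gt; as its own category during &lt;code&gt;fit&lt;/code&gt;):&lt;/p&gt;

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Toy column containing one missing value (illustrative only)
col = pd.DataFrame({'Embarked': ['S', 'C', 'Q', 'S', np.nan]})

enc = OneHotEncoder(handle_unknown='ignore')
enc.fit(col)

# NaN shows up as a fourth learned category alongside C, Q, S
print(enc.categories_[0])
```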



&lt;p&gt;But in the test data, there were &lt;strong&gt;no NaN values&lt;/strong&gt;, only:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;C&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Q&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;S&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So the encoder &lt;strong&gt;ignored the unseen &lt;code&gt;NaN&lt;/code&gt; category&lt;/strong&gt;, resulting in:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;x_test_embarked&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;179&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  ✅ The Fix
&lt;/h2&gt;

&lt;p&gt;The correct way was to assign the imputed values &lt;em&gt;back&lt;/em&gt; to the original DataFrame:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;x_train&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Embarked&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;si_embarked&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit_transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x_train&lt;/span&gt;&lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Embarked&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt;
&lt;span class="n"&gt;x_test&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Embarked&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;si_embarked&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x_test&lt;/span&gt;&lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Embarked&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, when I fit the encoder:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;ohe_embarked&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OneHotEncoder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sparse&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;handle_unknown&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ignore&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;x_train_embarked&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ohe_embarked&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit_transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x_train&lt;/span&gt;&lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Embarked&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt;
&lt;span class="n"&gt;x_test_embarked&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ohe_embarked&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x_test&lt;/span&gt;&lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Embarked&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;✅ The shapes finally matched:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;x_train_embarked&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt; &lt;span class="err"&gt;→&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;712&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;x_test_embarked&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt; &lt;span class="err"&gt;→&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;179&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  📌 Bonus: What &lt;code&gt;handle_unknown='ignore'&lt;/code&gt; Really Means
&lt;/h2&gt;

&lt;p&gt;Here’s a quick visual explanation:&lt;/p&gt;

&lt;p&gt;Imagine your training data had these categories:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;['Red', 'Blue', 'Green']
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And you encode them like this:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Red&lt;/th&gt;
&lt;th&gt;Blue&lt;/th&gt;
&lt;th&gt;Green&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Now your test data contains a &lt;strong&gt;new&lt;/strong&gt; category: &lt;code&gt;'Yellow'&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;If you use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nc"&gt;OneHotEncoder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;handle_unknown&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ignore&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then the encoder will just assign all 0s for &lt;code&gt;'Yellow'&lt;/code&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Red&lt;/th&gt;
&lt;th&gt;Blue&lt;/th&gt;
&lt;th&gt;Green&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;✅ No crash. But also — you now have a row of all zeros!&lt;/p&gt;




&lt;h2&gt;
  
  
  🎓 Final Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Always &lt;strong&gt;handle missing values before encoding&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;If you’re using &lt;code&gt;SimpleImputer&lt;/code&gt;, &lt;strong&gt;assign the output back&lt;/strong&gt; to your original DataFrame&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;handle_unknown='ignore'&lt;/code&gt; prevents errors, but doesn’t fix shape mismatches caused by unseen categories during &lt;code&gt;.fit()&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;This was a great learning moment for me while working through the Titanic dataset with CampusX. Hope this helps anyone else facing the same mystery! 🧩&lt;/p&gt;

&lt;p&gt;Let me know if you've run into similar preprocessing surprises!&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Regression in ML Explained! 🚀 The Ultimate Hands-on Guide</title>
      <dc:creator>Anik Chand</dc:creator>
      <pubDate>Wed, 19 Mar 2025 11:53:54 +0000</pubDate>
      <link>https://forem.com/anikchand461/regression-in-ml-explained-the-ultimate-hands-on-guide-484f</link>
      <guid>https://forem.com/anikchand461/regression-in-ml-explained-the-ultimate-hands-on-guide-484f</guid>
      <description>&lt;h2&gt;
  
  
  💡 How does Netflix know what you’ll binge-watch next? Or how do businesses predict future sales with impressive accuracy?
&lt;/h2&gt;

&lt;p&gt;The magic behind these predictions is &lt;strong&gt;Regression&lt;/strong&gt;—a fundamental technique in Machine Learning! 🚀  &lt;/p&gt;

&lt;p&gt;Whether it's forecasting &lt;strong&gt;house prices 🏡, stock trends 📈, or weather patterns 🌦️&lt;/strong&gt;, regression plays a crucial role in making data-driven decisions. In this guide, we’ll break it all down—step by step—with easy explanations, real-world examples, and hands-on code.  &lt;/p&gt;

&lt;h3&gt;
  
  
  🔍 What’s in store for you?
&lt;/h3&gt;

&lt;p&gt;We'll explore various &lt;strong&gt;Regression algorithms&lt;/strong&gt;, understand how they work, and see them in action with practical applications. Let’s dive in! 🔥&lt;/p&gt;

&lt;h1&gt;
  
  
  💡 1. Linear Regression: The Foundation of Predictive Modeling
&lt;/h1&gt;

&lt;p&gt;Linear Regression is the most fundamental regression technique, assuming a &lt;strong&gt;straight-line relationship&lt;/strong&gt; between input variables (X) and the output (Y). It is widely used for predicting trends, making forecasts, and understanding relationships between variables.  &lt;/p&gt;

&lt;p&gt;By fitting a linear equation to the observed data, Linear Regression helps in estimating the dependent variable based on independent variables. The equation of a simple linear regression is:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwq3x2yvzbhiy7piderxv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwq3x2yvzbhiy7piderxv.png" alt=" " width="600" height="95"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  📌 Where:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Y&lt;/strong&gt; = Predicted value (dependent variable)
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;X&lt;/strong&gt; = Input feature (independent variable)
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;b₀&lt;/strong&gt; = Intercept (constant term)
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;b₁&lt;/strong&gt; = Slope (coefficient of X)
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ε&lt;/strong&gt; = Error term
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🔹 Key Applications of Linear Regression:
&lt;/h2&gt;

&lt;p&gt;✅ &lt;strong&gt;Stock Market Predictions&lt;/strong&gt; 📈&lt;br&gt;&lt;br&gt;
✅ &lt;strong&gt;Sales Forecasting&lt;/strong&gt; 🛍️&lt;br&gt;&lt;br&gt;
✅ &lt;strong&gt;Real Estate Price Estimation&lt;/strong&gt; 🏡&lt;br&gt;&lt;br&gt;
✅ &lt;strong&gt;Medical Research &amp;amp; Risk Analysis&lt;/strong&gt; ⚕️  &lt;/p&gt;


&lt;h2&gt;
  
  
  🖥️ Implementing Linear Regression in Python:
&lt;/h2&gt;

&lt;p&gt;Let's implement &lt;strong&gt;Simple Linear Regression&lt;/strong&gt; using Python and &lt;strong&gt;Scikit-Learn&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;matplotlib.pyplot&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.linear_model&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LinearRegression&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.model_selection&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;train_test_split&lt;/span&gt;

&lt;span class="c1"&gt;# Sample dataset
&lt;/span&gt;&lt;span class="n"&gt;data_X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;reshape&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;data_Y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="c1"&gt;# Splitting the data
&lt;/span&gt;&lt;span class="n"&gt;X_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;X_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Y_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Y_test&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;train_test_split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data_X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data_Y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;random_state&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Model training
&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LinearRegression&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Y_train&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Predictions
&lt;/span&gt;&lt;span class="n"&gt;y_pred&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_test&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Plotting the regression line
&lt;/span&gt;&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;scatter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data_X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data_Y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;blue&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Actual Data&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_pred&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;red&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;linewidth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Regression Line&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Input Feature (X)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Output (Y)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Linear Regression Model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;legend&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  📊 Output Visualization:
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjtfndrrktk5w2p3x137d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjtfndrrktk5w2p3x137d.png" alt="Linear Regression Plot" width="800" height="550"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This simple example demonstrates how &lt;strong&gt;Linear Regression&lt;/strong&gt; can be implemented using &lt;strong&gt;Scikit-Learn&lt;/strong&gt; in Python. 🚀&lt;br&gt;&lt;br&gt;
Stay tuned as we explore more regression techniques in the next sections! 🔥&lt;/p&gt;




&lt;h3&gt;
  
  
  🔎 &lt;strong&gt;Example Use Case:&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;📌 &lt;strong&gt;Predicting house prices based on square footage&lt;/strong&gt; 🏠&lt;br&gt;&lt;br&gt;
Imagine you have a dataset with house sizes and their respective prices. By applying &lt;strong&gt;Linear Regression&lt;/strong&gt;, you can predict the price of a house based on its area!&lt;/p&gt;
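&lt;p&gt;As a minimal sketch of that use case (the sizes and prices below are invented, purely for illustration), the whole workflow fits in a few lines:&lt;/p&gt;

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: house sizes (sq ft) and prices (in $1000s)
sizes = np.array([800, 1000, 1200, 1500, 1800, 2000]).reshape(-1, 1)
prices = np.array([150, 180, 210, 260, 300, 330])

# Fit a line: price = b1 * size + b0
model = LinearRegression()
model.fit(sizes, prices)

# Predict the price of a 1600 sq ft house
predicted = model.predict([[1600]])
print(f"Predicted price: ${predicted[0]:.0f}k")
```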




&lt;blockquote&gt;
&lt;p&gt;📢 &lt;strong&gt;Tip:&lt;/strong&gt; Always check model assumptions like linearity, independence, and normal distribution of residuals before applying Linear Regression in real-world scenarios.&lt;/p&gt;
&lt;/blockquote&gt;
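&lt;p&gt;Two of those checks (residuals centered on zero, and no leftover trend in the residuals) can be eyeballed with plain NumPy. A sketch on synthetic data, for illustration only:&lt;/p&gt;

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic, roughly linear data (illustrative only)
rng = np.random.default_rng(42)
X = np.arange(1, 21, dtype=float).reshape(-1, 1)
y = 2.5 * X.ravel() + 1 + rng.normal(0, 1, 20)

model = LinearRegression().fit(X, y)
residuals = y - model.predict(X)

# With an intercept, OLS residuals average to ~0 by construction
print(f"Mean residual: {residuals.mean():.2e}")

# Residuals should show no trend against X; a correlation near 0 is a good sign
corr = np.corrcoef(X.ravel(), residuals)[0, 1]
print(f"Residual-vs-X correlation: {corr:.2e}")
```

For a formal normality check on the residuals, a test such as Shapiro-Wilk (available in SciPy) is a common follow-up.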

&lt;p&gt;Let’s move on to more advanced regression techniques in the next section! 🚀&lt;/p&gt;







&lt;h2&gt;
  
  
  🚀 &lt;strong&gt;2. Multiple Linear Regression: Expanding Predictive Power&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Multiple Linear Regression extends &lt;strong&gt;Simple Linear Regression&lt;/strong&gt; by using several input variables to predict an outcome. Instead of modeling the relationship between a single independent variable and the dependent variable, it considers &lt;strong&gt;two or more independent variables&lt;/strong&gt;, which often improves predictive accuracy.  &lt;/p&gt;

&lt;h3&gt;
  
  
  🔍 &lt;strong&gt;Understanding Multiple Linear Regression&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;In Multiple Linear Regression, the relationship between the dependent variable (&lt;strong&gt;Y&lt;/strong&gt;) and multiple independent variables (&lt;strong&gt;X₁, X₂, X₃, ... Xₙ&lt;/strong&gt;) is represented as:  &lt;/p&gt;

&lt;h3&gt;
  
  
  📏 &lt;strong&gt;Equation of Multiple Linear Regression:&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqbnq75zdur49nnfo1zuj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqbnq75zdur49nnfo1zuj.png" alt=" " width="576" height="65"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Where:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Y&lt;/strong&gt; = Dependent variable (what we predict)
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;X₁, X₂, X₃, ... Xₙ&lt;/strong&gt; = Independent variables (input features)
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;b₀&lt;/strong&gt; = Intercept (constant term)
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;b₁, b₂, ..., bₙ&lt;/strong&gt; = Coefficients representing the influence of each variable
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ε&lt;/strong&gt; = Error term
&lt;/li&gt;
&lt;/ul&gt;
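&lt;p&gt;In code, the equation is just a weighted sum. The coefficients and feature values below are made up purely to show the arithmetic:&lt;/p&gt;

```python
import numpy as np

# Hypothetical coefficients: intercept b0 and weights b1, b2, b3
b0 = 50.0
b = np.array([0.08, 10.0, 5.0])

# One observation: X1 = size (sq ft), X2 = bedrooms, X3 = location rating
x = np.array([1500, 3, 8])

# Y = b0 + b1*X1 + b2*X2 + b3*X3 (error term omitted for a point prediction)
y = b0 + np.dot(b, x)
print(y)  # 50 + 120 + 30 + 40 = 240.0
```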




&lt;h3&gt;
  
  
  📊 &lt;strong&gt;Visual Representation:&lt;/strong&gt;
&lt;/h3&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;1️⃣ Concept of Multiple Regression&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm67fhowrh17m5czvlrf6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm67fhowrh17m5czvlrf6.png" alt="Multiple Regression Concept" width="800" height="573"&gt;&lt;/a&gt;  &lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;2️⃣ Regression Plane Representation (for 2 Variables)&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm7fqkqjlzikkhirlwzg5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm7fqkqjlzikkhirlwzg5.png" alt="Regression Plane" width="800" height="382"&gt;&lt;/a&gt;  &lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;3️⃣ Multiple Linear Regression Formula Breakdown&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fclomzxjk4rsceyh8w8uc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fclomzxjk4rsceyh8w8uc.png" alt="Regression Equation" width="800" height="406"&gt;&lt;/a&gt;  &lt;/p&gt;




&lt;h3&gt;
  
  
  🖥️ &lt;strong&gt;Code Implementation: Mean Squared Error (MSE) in Python&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;mean_squared_error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y_actual&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_pred&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Compute the Mean Squared Error (MSE) cost function.

    Parameters:
    y_actual : np.array : Actual values
    y_pred : np.array : Predicted values (mx + c)

    Returns:
    float : MSE value
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y_actual&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Number of data points
&lt;/span&gt;    &lt;span class="n"&gt;mse&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;y_actual&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;y_pred&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;mse&lt;/span&gt;

&lt;span class="c1"&gt;# Example Data
&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;  &lt;span class="c1"&gt;# Input features
&lt;/span&gt;&lt;span class="n"&gt;y_actual&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;  &lt;span class="c1"&gt;# Actual output values
&lt;/span&gt;
&lt;span class="c1"&gt;# Linear regression parameters
&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;  &lt;span class="c1"&gt;# Slope
&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;  &lt;span class="c1"&gt;# Intercept
&lt;/span&gt;
&lt;span class="c1"&gt;# Compute predictions
&lt;/span&gt;&lt;span class="n"&gt;y_pred&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;

&lt;span class="c1"&gt;# Compute MSE
&lt;/span&gt;&lt;span class="n"&gt;mse_value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;mean_squared_error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y_actual&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_pred&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Mean Squared Error (MSE):&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mse_value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  🏠 &lt;strong&gt;Example Use Case: Predicting House Prices&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Features considered:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;X₁:&lt;/strong&gt; Size of the house (sq ft)
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;X₂:&lt;/strong&gt; Number of bedrooms
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;X₃:&lt;/strong&gt; Location rating
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Y:&lt;/strong&gt; Predicted house price
&lt;/li&gt;
&lt;/ul&gt;
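&lt;p&gt;A minimal sketch of this use case with scikit-learn, using invented numbers for the features and prices:&lt;/p&gt;

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training data: [size (sq ft), bedrooms, location rating]
X = np.array([
    [1000, 2, 5],
    [1500, 3, 7],
    [1800, 3, 8],
    [2200, 4, 9],
    [2500, 4, 6],
])
y = np.array([200, 290, 340, 410, 420])  # prices in $1000s

model = LinearRegression().fit(X, y)

# Predict the price of a 2000 sq ft, 3-bedroom house with location rating 8
pred = model.predict([[2000, 3, 8]])[0]
print(f"Predicted price: ${pred:.0f}k")
```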

&lt;h3&gt;
  
  
  ✅ &lt;strong&gt;Advantages of Multiple Linear Regression:&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;✔️ Captures the effect of multiple variables for better predictions.&lt;br&gt;&lt;br&gt;
✔️ Useful for complex real-world scenarios like &lt;strong&gt;finance, healthcare, and business analytics&lt;/strong&gt;.  &lt;/p&gt;

&lt;h3&gt;
  
  
  ❌ &lt;strong&gt;Challenges of Multiple Linear Regression:&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;⚠️ More features increase &lt;strong&gt;complexity and overfitting risks&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
⚠️ Requires &lt;strong&gt;careful feature selection and normalization&lt;/strong&gt; for accuracy.  &lt;/p&gt;







&lt;h2&gt;
  
  
  🚀 3. Polynomial Regression: Capturing Non-Linear Trends
&lt;/h2&gt;

&lt;p&gt;When data doesn’t follow a straight-line trend, &lt;strong&gt;Polynomial Regression&lt;/strong&gt; helps model &lt;strong&gt;non-linear relationships&lt;/strong&gt; by introducing polynomial terms to the equation. This technique is useful when the relationship between the independent and dependent variables is curved.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1ubrs633zmoglhcopkr4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1ubrs633zmoglhcopkr4.png" alt="Polynomial Regression Visualization" width="800" height="588"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  📌 &lt;strong&gt;Equation:&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Polynomial Regression extends Linear Regression by incorporating higher-degree polynomial terms:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Far9svebb2zfkx5xi2dqd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Far9svebb2zfkx5xi2dqd.png" alt="Polynomial Equation" width="800" height="107"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Y&lt;/strong&gt; is the predicted output&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;X&lt;/strong&gt; is the input feature&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;b₀, b₁, b₂, …, bₙ&lt;/strong&gt; are the regression coefficients&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;n&lt;/strong&gt; is the polynomial degree&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ε&lt;/strong&gt; is the error term&lt;/li&gt;
&lt;/ul&gt;
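&lt;p&gt;Under the hood, scikit-learn's &lt;code&gt;PolynomialFeatures&lt;/code&gt; simply expands each input into the powers from the equation above, and a plain &lt;code&gt;LinearRegression&lt;/code&gt; then fits the coefficients. A tiny sketch of the expansion:&lt;/p&gt;

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Two sample inputs, expanded up to degree 3
X = np.array([[2], [3]])
X_poly = PolynomialFeatures(degree=3).fit_transform(X)

# Each row becomes [1, X, X^2, X^3]; LinearRegression fits b0..b3 on these columns
print(X_poly)
```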

&lt;h3&gt;
  
  
  🔍 &lt;strong&gt;Real-World Applications of Polynomial Regression:&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;📈 &lt;strong&gt;Salary Prediction:&lt;/strong&gt; Estimating salary growth over time, where experience influences salary in a non-linear fashion.&lt;/li&gt;
&lt;li&gt;🦠 &lt;strong&gt;COVID-19 Trend Forecasting:&lt;/strong&gt; Modeling infection rate trends, which often follow polynomial or exponential growth.&lt;/li&gt;
&lt;li&gt;🚗 &lt;strong&gt;Vehicle Performance Modeling:&lt;/strong&gt; Predicting fuel consumption based on speed and engine performance.&lt;/li&gt;
&lt;li&gt;📊 &lt;strong&gt;Economics &amp;amp; Finance:&lt;/strong&gt; Forecasting demand, inflation, and economic trends where relationships are complex.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  ✅ &lt;strong&gt;Advantages:&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;✔️ Works well for &lt;strong&gt;curved datasets&lt;/strong&gt; where Linear Regression fails.&lt;br&gt;&lt;br&gt;
✔️ Provides a &lt;strong&gt;better fit&lt;/strong&gt; for non-linear trends when the correct degree is chosen.  &lt;/p&gt;

&lt;h3&gt;
  
  
  ❌ &lt;strong&gt;Disadvantages:&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;❌ Can &lt;strong&gt;overfit&lt;/strong&gt; the data if the polynomial degree is too high.&lt;br&gt;&lt;br&gt;
❌ Harder to interpret compared to simple Linear Regression.  &lt;/p&gt;

&lt;h3&gt;
  
  
  🖥️ &lt;strong&gt;Python Code for Polynomial Regression:&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;matplotlib.pyplot&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.preprocessing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PolynomialFeatures&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.linear_model&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LinearRegression&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.pipeline&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;make_pipeline&lt;/span&gt;

&lt;span class="c1"&gt;# Sample dataset
&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;reshape&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;18&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;75&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;105&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;140&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;180&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="c1"&gt;# Creating a polynomial model (degree = 2)
&lt;/span&gt;&lt;span class="n"&gt;poly_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;make_pipeline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;PolynomialFeatures&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;degree&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nc"&gt;LinearRegression&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="n"&gt;poly_model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;y_pred&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;poly_model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Plot results
&lt;/span&gt;&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;scatter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;blue&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Actual Data&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_pred&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;red&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;linewidth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Polynomial Regression Line&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Input Feature (X)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Output (Y)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Polynomial Regression Model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;legend&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  📌 &lt;strong&gt;Visual Representation:&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbq51430kehsuhhssyh67.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbq51430kehsuhhssyh67.png" alt="Polynomial Regression Curve" width="800" height="314"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Polynomial Regression allows machine learning models to capture &lt;strong&gt;non-linear relationships&lt;/strong&gt; and make better predictions in real-world scenarios. 🚀&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;4. Logistic Regression (For Classification):&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Although it contains &lt;strong&gt;"Regression"&lt;/strong&gt; in its name, Logistic Regression is used for &lt;strong&gt;Classification problems, not Regression.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Instead of predicting continuous values, it predicts probabilities and assigns categories like &lt;strong&gt;Yes/No, Pass/Fail, Spam/Not Spam&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkdmlduu1mo8usd1311ac.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkdmlduu1mo8usd1311ac.png" alt="Logistic Regression" width="800" height="659"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;&lt;em&gt;Equation:&lt;/em&gt;&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fei7kkwuttp2a5ite4ca0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fei7kkwuttp2a5ite4ca0.png" alt="Equation" width="762" height="440"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;where &lt;strong&gt;P&lt;/strong&gt; is the &lt;strong&gt;probability&lt;/strong&gt; of belonging to a class.&lt;/p&gt;
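&lt;p&gt;The Sigmoid function that produces &lt;strong&gt;P&lt;/strong&gt; is one line of NumPy. A minimal sketch:&lt;/p&gt;

```python
import numpy as np

def sigmoid(z):
    """Map any real-valued score z to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# A score of 0 sits on the decision boundary (P = 0.5);
# large positive scores map near 1, large negative scores near 0
print(sigmoid(0))    # 0.5
print(sigmoid(4))    # ~0.982
print(sigmoid(-4))   # ~0.018
```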

&lt;h3&gt;
  
  
  &lt;strong&gt;✅ Example:&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Predicting &lt;strong&gt;whether a customer will buy a product (Yes/No).&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Classifying &lt;strong&gt;emails as spam or not.&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;✅ Why is it called Regression?&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Although it’s used for classification, &lt;strong&gt;Logistic Regression&lt;/strong&gt; first computes a linear combination of the inputs (the same weighted sum used in Linear Regression) and then applies the &lt;strong&gt;Sigmoid function&lt;/strong&gt; to convert that value into a probability.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;🖥️ Python Implementation of Logistic Regression&lt;/strong&gt;
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;matplotlib.pyplot&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.model_selection&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;train_test_split&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.linear_model&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LogisticRegression&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.metrics&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;accuracy_score&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;classification_report&lt;/span&gt;

&lt;span class="c1"&gt;# Sample dataset (Binary classification: Pass (1) or Fail (0))
&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;([[&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;35&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;40&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;45&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;55&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;65&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt;  &lt;span class="c1"&gt;# Hours studied
&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;  &lt;span class="c1"&gt;# 0 = Fail, 1 = Pass
&lt;/span&gt;
&lt;span class="c1"&gt;# Splitting dataset
&lt;/span&gt;&lt;span class="n"&gt;X_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;X_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_test&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;train_test_split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;random_state&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Training Logistic Regression model
&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LogisticRegression&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_train&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Predictions
&lt;/span&gt;&lt;span class="n"&gt;y_pred&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_test&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Evaluating model
&lt;/span&gt;&lt;span class="n"&gt;accuracy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;accuracy_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_pred&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Accuracy:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;accuracy&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Classification Report:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;classification_report&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_pred&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="c1"&gt;# Plotting the sigmoid curve
&lt;/span&gt;&lt;span class="n"&gt;X_range&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;linspace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;70&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;reshape&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;y_probs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;predict_proba&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_range&lt;/span&gt;&lt;span class="p"&gt;)[:,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;scatter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;blue&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Actual Data&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_range&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_probs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;red&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Sigmoid Curve&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hours Studied&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Probability of Passing&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Logistic Regression Model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;legend&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;📌 Expected Output:&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Accuracy: 1.0  # (Might vary slightly depending on random split)
Classification Report:
              precision    recall  f1-score   support

           0       1.00      1.00      1.00         1
           1       1.00      1.00      1.00         1

    accuracy                           1.00         2
   macro avg       1.00      1.00      1.00         2
weighted avg       1.00      1.00      1.00         2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This implementation demonstrates how &lt;strong&gt;Logistic Regression&lt;/strong&gt; is used for binary classification. The model predicts whether a student will &lt;strong&gt;pass or fail&lt;/strong&gt; based on study hours, and we visualize the &lt;strong&gt;sigmoid function curve&lt;/strong&gt;. 📊🔥&lt;/p&gt;
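&lt;p&gt;One way to sanity-check the fitted model is to query predictions for new study-hour values. The sketch below refits the same toy dataset so it runs on its own; exact probabilities depend on the solver and regularization, but the trend should hold:&lt;/p&gt;

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Same toy dataset as above: hours studied vs. pass (1) / fail (0)
X = np.array([[20], [25], [30], [35], [40], [45], [50], [55], [60], [65]])
y = np.array([0, 0, 0, 1, 1, 1, 1, 1, 1, 1])

model = LogisticRegression().fit(X, y)

# Predicted probability of passing rises with hours studied
print(model.predict([[20]]))   # [0] -> fail
print(model.predict([[60]]))   # [1] -> pass
```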







&lt;h1&gt;
  
  
  &lt;strong&gt;📌 Conclusion: Regression in Machine Learning&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;Regression is a fundamental concept in &lt;strong&gt;Machine Learning&lt;/strong&gt;, enabling us to make &lt;strong&gt;continuous predictions&lt;/strong&gt; based on input features. It is widely used in forecasting, trend analysis, and data-driven decision-making.&lt;/p&gt;

&lt;h2&gt;
  
  
  🔹 &lt;strong&gt;Quick Summary of Regression Algorithms&lt;/strong&gt;
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;Algorithm&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Use Case&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Equation Type&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Best For&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Linear Regression&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Predicting sales, stock prices&lt;/td&gt;
&lt;td&gt;Linear equation&lt;/td&gt;
&lt;td&gt;Simple relationships between variables&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Multiple Regression&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;House pricing with multiple factors&lt;/td&gt;
&lt;td&gt;Linear (Multiple Inputs)&lt;/td&gt;
&lt;td&gt;Impact of multiple features&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Polynomial Regression&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Salary growth trends, COVID-19 cases&lt;/td&gt;
&lt;td&gt;Polynomial equation&lt;/td&gt;
&lt;td&gt;Capturing non-linear patterns&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Logistic Regression&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Spam detection, customer conversion&lt;/td&gt;
&lt;td&gt;Sigmoid function&lt;/td&gt;
&lt;td&gt;Classification problems&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  🏆 &lt;strong&gt;Key Takeaways&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;✅ Regression is essential for predictive modeling in &lt;strong&gt;real-world applications&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
✅ Choosing the right regression technique depends on &lt;strong&gt;data patterns and relationships&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
✅ Logistic Regression is used for &lt;strong&gt;classification&lt;/strong&gt;, despite its name.  &lt;/p&gt;

&lt;p&gt;Regression models &lt;strong&gt;power AI-driven decision-making&lt;/strong&gt;, forming the backbone of modern analytics and forecasting! 🚀&lt;/p&gt;

</description>
      <category>python</category>
      <category>datascience</category>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>🎮 Crafting Fun with Code: My Journey Building the Hangman Game 🕹️</title>
      <dc:creator>Anik Chand</dc:creator>
      <pubDate>Tue, 07 Jan 2025 14:14:48 +0000</pubDate>
      <link>https://forem.com/anikchand461/crafting-fun-with-code-my-journey-building-the-hangman-game-11ih</link>
      <guid>https://forem.com/anikchand461/crafting-fun-with-code-my-journey-building-the-hangman-game-11ih</guid>
      <description>&lt;p&gt;Repo link :  &lt;a href="https://github.com/anikchand461/Hangman-Game" rel="noopener noreferrer"&gt;https://github.com/anikchand461/Hangman-Game&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What motivated me to start blogging?&lt;/strong&gt;&lt;br&gt;
    This is my first blog, and the motivation to start blogging came from my friend, Abhiraj Adhikary. He encouraged me to explore blogging as a way to connect with like-minded people, share knowledge, and engage with a broader community. His insights made me realize that blogging isn’t just about documenting ideas but also a way to clarify and refine my own concepts, particularly in projects. Writing about my work helps me dive deeper into the details, reflect on my learning, and present it in a way that others can benefit from. Inspired by his advice, I decided to start this journey to share my experiences and build meaningful connections with others in the tech and learning community.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is the Hangman Game?&lt;/strong&gt;&lt;br&gt;
     Hangman is a classic word-guessing game with a simple yet engaging objective: players guess a hidden word by suggesting letters within a limited number of attempts. With each incorrect guess, a part of a stick-figure “hangman” is drawn, increasing the tension. The game tests vocabulary, problem-solving skills, and strategy, making it a popular choice for both casual play and learning activities.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What inspired me to create this game?&lt;/strong&gt;&lt;br&gt;
     I was learning Python through Jenny’s Lecture CS IT YouTube channel, where the teacher introduced the Hangman game as a project. However, I decided to take a different approach—I wanted to challenge myself to create the game independently, without watching the tutorial. To start, I played a few Hangman games from apps downloaded from the Play Store, which helped me understand the gameplay and logic. These experiences inspired me to incorporate small changes and enhancements into my version of the game, aiming for a more polished and engaging result.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Objective of the game&lt;/strong&gt;&lt;br&gt;
    The main goal of creating my Hangman game was twofold. It was a fun project that allowed players to enjoy a word-guessing game with categories like sports, food, and animals. For me, it was also a valuable learning experience, reinforcing my understanding of Python concepts like conditionals, loops, and the random module for dynamic word selection. Building the game enhanced my logic-building skills, as I designed the gameplay flow, added ASCII art, and implemented features like replay options. This project was a perfect mix of fun and learning, marking an important step in my Python journey.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Features of the Game&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Diverse Word Categories&lt;/strong&gt;&lt;br&gt;
The game includes multiple word categories, such as sports, food, and animals, offering players a varied and engaging experience.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dynamic Word Selection&lt;/strong&gt;&lt;br&gt;
With the use of Python’s random module, words are dynamically selected from the chosen category, ensuring unpredictability and replayability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Engaging ASCII Art&lt;/strong&gt;&lt;br&gt;
The game features creative ASCII art for the Hangman and the game header, adding a visual element that enhances the player’s experience.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7si25jfcg2vlfix636vb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7si25jfcg2vlfix636vb.png" alt=" " width="800" height="560"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Replay Option&lt;/strong&gt;&lt;br&gt;
Players have the option to replay the game after a round ends, making it easy to enjoy multiple sessions without restarting the program.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcvrukjbmdx7el2js41nz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcvrukjbmdx7el2js41nz.png" alt=" " width="800" height="46"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;User-Friendly Enhancements&lt;/strong&gt;&lt;br&gt;
To improve the gameplay experience, I incorporated user-friendly features such as:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;&lt;em&gt;Error-Handling&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
Ensuring the program runs smoothly even if the player makes invalid inputs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Score-keeping&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
Tracking player performance to add a competitive edge and encourage improvement.&lt;/p&gt;


&lt;/blockquote&gt;


&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  print(f'''
    GAME SUMMARY--
    Player name : {name}
    Difficulty level : {difficulty_level}
    lives used : {6 - life}
    Total attampts : {correct_attampts + wrong_attampts}
    Correct attampts : {correct_attampts}
    Wrong attampts : {wrong_attampts}
    ''')
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Technologies Used&lt;/strong&gt;&lt;br&gt;
    Programming Language: Python&lt;br&gt;
The game is built entirely using Python, a versatile and beginner-friendly programming language known for its simplicity and readability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Libraries used&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;

&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import random as r
import os
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;random&lt;/strong&gt; :&lt;br&gt;
 This library is utilized to dynamically select words from predefined categories, ensuring that each game offers a unique experience.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;os&lt;/strong&gt; :&lt;br&gt;
 The os module is used for tasks like clearing the screen between guesses, enhancing the overall gameplay presentation.&lt;/p&gt;
&lt;/blockquote&gt;
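&lt;p&gt;A minimal sketch of how these two modules fit together (the category dictionary below is a made-up stand-in for the game's real word lists):&lt;/p&gt;

```python
import os
import random

# Hypothetical stand-in for the game's real category word lists
words = {
    'sports': ['cricket', 'tennis', 'hockey'],
    'food': ['pizza', 'noodles', 'mango'],
    'animals': ['tiger', 'parrot', 'zebra'],
}

def pick_word(category):
    """Dynamically select a word from the chosen category."""
    return random.choice(words[category])

def clear_screen():
    """Clear the terminal between guesses ('cls' on Windows, 'clear' elsewhere)."""
    os.system('cls' if os.name == 'nt' else 'clear')

print(pick_word('sports'))
```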

&lt;p&gt;&lt;strong&gt;Why Python?&lt;/strong&gt;&lt;br&gt;
Python was chosen for this project because I was actively learning Python at the time. I thought building the Hangman game would be a great way to apply and master the concepts I was learning. The simplicity and versatility of Python made it the perfect choice for creating a fun and engaging project while improving my skills.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Code Walkthrough&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;First Implementation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I started by creating a flowchart to map out the game’s logic:&lt;br&gt;
  • I chose a predefined word and created a dashed list, with each dash representing a letter of the word.&lt;br&gt;
  • When the player guessed a letter, it was compared to the word. If a match was found, the dash was replaced with the correct letter. This continued until all letters were revealed or the player ran out of attempts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Secondary Improvement&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;After implementing the basic logic, I enhanced the game by:&lt;br&gt;
  • Allowing players to replace all instances of the guessed letter in the dashed list (e.g., “away” becomes [‘a’, ‘_’, ‘a’, ‘_’] when guessing ‘a’).&lt;/p&gt;


&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    for i in range(letter_numbers):
        list_word.append('_')
&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;• Introducing a lives system, where players start with 6 lives, losing 1 life with each incorrect guess. The game ends when the word is guessed correctly or lives run out.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Further Improvement&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I improved the game’s word selection by:&lt;br&gt;
  • Creating a list of 1000 words using ChatGPT and using the random module to select a word.&lt;br&gt;
  • Adding categories like sports, food, and animals, allowing players to choose based on their interests, making the game more engaging.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Additional Development&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To make the game more engaging, I added difficulty levels and visual enhancements:&lt;/p&gt;


&lt;pre class="highlight plaintext"&gt;&lt;code&gt;     if difficulty_level == 'easy':
        select_word = select_word1(key)
    elif difficulty_level == 'moderate':
        select_word = select_word2(key)
    elif difficulty_level == 'hard':
        select_word = select_word3(key)
    else:
        print('please enter valid input')
&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;Difficulty Levels:&lt;br&gt;
  • Introduced three levels: easy (1–5 letters), moderate (5–7 letters), and hard (8+ letters).&lt;br&gt;
  • Players could choose a level, and word selection adjusted accordingly for added challenge.&lt;br&gt;
Visual Enhancements:&lt;br&gt;
  • Included ASCII art hangman figures to show progress, with more detail as lives were lost.&lt;/p&gt;


&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    r"""
      +---+
      |   |  
      O   | 
     /|\  | 
     /    | 
          |
    """,
    r"""
      +---+
      |   |  
      O   | 
     /|\  | 
     / \  | 
          | ☠️
    """
&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;• Added win and loss ASCII art for a dramatic and satisfying game ending.&lt;br&gt;
These features made the game more dynamic, fun, and visually appealing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Final Touches&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To give the game a polished feel, I made key improvements:&lt;br&gt;
  • Added an introductory message with hangman ASCII art.&lt;br&gt;
  • Included player name tracking, lives used, and final results showing the word.&lt;br&gt;
  • Wrapped the game in a &lt;code&gt;while True&lt;/code&gt; loop for infinite rounds, with an option to replay after each round.&lt;br&gt;
These enhancements made the game more engaging, visually appealing, and fun.&lt;/p&gt;
&lt;/blockquote&gt;
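&lt;p&gt;Putting the core ideas from the walkthrough together, the guess-handling step can be sketched roughly like this (simplified names, not the exact repo code):&lt;/p&gt;

```python
def process_guess(guess, word, list_word, life):
    """Reveal every occurrence of the guessed letter, or deduct one life."""
    if guess in word:
        for i, letter in enumerate(word):
            if letter == guess:
                list_word[i] = guess   # replace all matching dashes
    else:
        life -= 1                      # wrong guess costs one life
    return life

word = 'away'
list_word = ['_'] * len(word)          # one dash per letter
life = 6

life = process_guess('a', word, list_word, life)
print(list_word)   # ['a', '_', 'a', '_']

life = process_guess('z', word, list_word, life)
print(life)        # 5

# A round ends when '_' not in list_word (win) or life == 0 (loss)
```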

&lt;p&gt;&lt;strong&gt;Lessons Learned&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;1. Problem-Solving Skills&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Building the Hangman game was a great way to enhance my problem-solving skills. Debugging errors and optimizing gameplay required critical analysis and logical thinking. Handling edge cases like duplicate guesses, invalid inputs, and word categories taught me to design robust conditions. Each challenge refined my structured problem-solving approach: identifying issues, breaking them down, and systematically testing solutions. This iterative process improved my code and strengthened my ability to handle complex problems in future projects.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Time Management&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Balancing learning Python while working on this project tested my time management skills. I scheduled time for learning concepts, implementing features, and refining the game. Initially, juggling syntax, debugging, and creative elements like ASCII art felt overwhelming. However, a step-by-step plan and task prioritization kept me organized. I focused on essential features, like basic game logic, before adding enhancements like visuals and error-handling. This disciplined approach improved my efficiency and became a valuable skill in my learning journey.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Enjoy The Game&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvrut81wme548imbyz67d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvrut81wme548imbyz67d.png" alt=" " width="800" height="472"&gt;&lt;/a&gt;&lt;br&gt;
    Feel free to explore the project on GitHub (repo link: &lt;a href="https://github.com/anikchand461/Hangman-Game" rel="noopener noreferrer"&gt;https://github.com/anikchand461/Hangman-Game&lt;/a&gt;), try the game, or share your feedback and ideas. I’d love to hear your thoughts and connect with others who are passionate about coding and learning. Let’s build something amazing together!&lt;/p&gt;

</description>
      <category>programming</category>
      <category>python</category>
    </item>
    <item>
      <title>technical blogs</title>
      <dc:creator>Anik Chand</dc:creator>
      <pubDate>Mon, 30 Dec 2024 17:57:10 +0000</pubDate>
      <link>https://forem.com/anikchand461/technical-blogs-nna</link>
      <guid>https://forem.com/anikchand461/technical-blogs-nna</guid>
      <description></description>
      <category>emptystring</category>
    </item>
    <item>
      <title>blog</title>
      <dc:creator>Anik Chand</dc:creator>
      <pubDate>Mon, 30 Dec 2024 17:50:43 +0000</pubDate>
      <link>https://forem.com/anikchand461/blog-1i50</link>
      <guid>https://forem.com/anikchand461/blog-1i50</guid>
      <description></description>
      <category>emptystring</category>
    </item>
  </channel>
</rss>
