<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Mustapha Tijani</title>
    <description>The latest articles on Forem by Mustapha Tijani (@themustaphatijani).</description>
    <link>https://forem.com/themustaphatijani</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3565453%2F3fe00f03-7fa8-4080-9b39-0794b5af20fc.jpeg</url>
      <title>Forem: Mustapha Tijani</title>
      <link>https://forem.com/themustaphatijani</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/themustaphatijani"/>
    <language>en</language>
    <item>
      <title>How to Build an AI vs Human Image Detector Using Streamlit &amp; Transformers</title>
      <dc:creator>Mustapha Tijani</dc:creator>
      <pubDate>Tue, 09 Dec 2025 17:00:16 +0000</pubDate>
      <link>https://forem.com/themustaphatijani/how-to-build-an-ai-vs-human-image-detector-using-streamlit-transformers-2a61</link>
      <guid>https://forem.com/themustaphatijani/how-to-build-an-ai-vs-human-image-detector-using-streamlit-transformers-2a61</guid>
      <description>&lt;p&gt;Artificial Intelligence models like &lt;strong&gt;SDXL&lt;/strong&gt;, &lt;strong&gt;Grok&lt;/strong&gt;, &lt;strong&gt;Gemini&lt;/strong&gt;, and others are producing images so realistic that even humans can’t always tell them apart from real photos. As these models get better, traditional detectors become less effective.&lt;/p&gt;

&lt;p&gt;In this guide, I’ll show you how to &lt;strong&gt;build your own AI-vs-Human Image Detector&lt;/strong&gt; using:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Streamlit&lt;/strong&gt; for the UI&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Hugging Face Transformers&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;PyTorch&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;A modern detection model: &lt;strong&gt;Organika/sdxl-detector&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Overview of What We’re Building
&lt;/h2&gt;

&lt;p&gt;This detector:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Accepts an uploaded image&lt;/li&gt;
&lt;li&gt;Processes it using a pretrained deep-learning model&lt;/li&gt;
&lt;li&gt;Predicts whether the image is &lt;strong&gt;AI-generated&lt;/strong&gt; or &lt;strong&gt;Human-captured&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Displays the model’s confidence score&lt;/li&gt;
&lt;li&gt;Works on CPU, CUDA, or Apple Silicon (MPS)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The entire stack sits inside a simple Streamlit app that users can run locally or online.&lt;/p&gt;

&lt;h1&gt;
  
  
  Step-by-Step: Let's Get Started
&lt;/h1&gt;

&lt;p&gt;Below, we’ll break down the &lt;strong&gt;important sections&lt;/strong&gt; of the script so that you not only use it but also understand &lt;em&gt;why it works&lt;/em&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Environment Setup
&lt;/h2&gt;

&lt;p&gt;First, we need to set up and activate a virtual environment.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python &lt;span class="nt"&gt;-m&lt;/span&gt; venv &lt;span class="nb"&gt;env
source env&lt;/span&gt;/bin/activate  &lt;span class="c"&gt;# On Linux/macOS&lt;/span&gt;
&lt;span class="c"&gt;# env\Scripts\activate   # On Windows&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Installing the Packages
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Core dependencies&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;streamlit pillow torch transformers accelerate

&lt;span class="c"&gt;# Optional: If you encounter errors with an old NumPy version (e.g., NumPy 2.x),&lt;/span&gt;
&lt;span class="c"&gt;# you may need to downgrade it for PyTorch compatibility:&lt;/span&gt;
&lt;span class="c"&gt;# pip install "numpy&amp;lt;2"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Importing Dependencies &amp;amp; Setting the Model
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;streamlit&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;PIL&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AutoImageProcessor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AutoModelForImageClassification&lt;/span&gt; 

&lt;span class="n"&gt;MODEL_ID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Organika/sdxl-detector&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What’s happening here?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;streamlit&lt;/code&gt; powers the web interface&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Pillow&lt;/code&gt; loads and manipulates the uploaded images&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;torch&lt;/code&gt; handles model execution&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;transformers&lt;/code&gt; loads the Hugging Face model&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;MODEL_ID&lt;/code&gt; points to a detector trained to spot SDXL-style AI-generated imagery&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Selecting the Compute Device (CPU / MPS)
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;backends&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mps&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;is_available&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;device&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;device&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mps&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;device&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;device&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cpu&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This section ensures:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Mac users get fast inference via &lt;strong&gt;Apple Silicon (MPS)&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Everyone else falls back to &lt;strong&gt;CPU&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;
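&lt;p&gt;The overview also mentions &lt;strong&gt;CUDA&lt;/strong&gt;; a minimal extension of the same check covers NVIDIA GPUs too (assuming the rest of the script keeps using this &lt;code&gt;device&lt;/code&gt; variable):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import torch

# Prefer CUDA, then Apple Silicon (MPS), then fall back to CPU.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;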

&lt;h2&gt;
  
  
  Loading the Model &amp;amp; Processor
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;processor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoImageProcessor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;MODEL_ID&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoModelForImageClassification&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;MODEL_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;torch_dtype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;float32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;device_map&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;auto&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;eval&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Image Processor&lt;/strong&gt; converts PIL images into model tensors&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Model&lt;/strong&gt; is loaded with smart device placement&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;eval()&lt;/code&gt; ensures the model runs in inference mode&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;device_map="auto"&lt;/code&gt; makes Transformers automatically handle multi-device setups.&lt;/p&gt;
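&lt;p&gt;Note that &lt;code&gt;device_map="auto"&lt;/code&gt; relies on the &lt;code&gt;accelerate&lt;/code&gt; package. If you prefer explicit placement on the device selected earlier, a sketch of the alternative:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import torch
from transformers import AutoModelForImageClassification

MODEL_ID = "Organika/sdxl-detector"
device = torch.device("cpu")  # or whichever device was selected above

# Load without device_map, then move the whole model explicitly.
model = AutoModelForImageClassification.from_pretrained(MODEL_ID, torch_dtype=torch.float32)
model.to(device)
model.eval()  # inference mode
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;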

&lt;h2&gt;
  
  
  The Image Classification Function
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;predict_pil&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;inputs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;processor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;images&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;return_tensors&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;inputs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;float&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;

    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;no_grad&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;outputs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;logits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;outputs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;logits&lt;/span&gt;
    &lt;span class="n"&gt;probs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;softmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;logits&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dim&lt;/span&gt;&lt;span class="o"&gt;=-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="n"&gt;pred&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;argmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;probs&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;item&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;label&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id2label&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;pred&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;confidence&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;probs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;pred&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;confidence&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This function:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Converts the image into model-ready tensors&lt;/li&gt;
&lt;li&gt;Moves them to the correct device&lt;/li&gt;
&lt;li&gt;Runs a forward pass (no gradients)&lt;/li&gt;
&lt;li&gt;Applies softmax to get probabilities&lt;/li&gt;
&lt;li&gt;Returns two values:&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Predicted class label&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Confidence score&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the core part of the detector.&lt;/p&gt;
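&lt;p&gt;For a quick sanity check outside Streamlit, you can call the function directly on a local file (the filename below is just a placeholder, and &lt;code&gt;predict_pil&lt;/code&gt; is the function defined above):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from PIL import Image

# "sample.jpg" is a placeholder; point it at any image on disk.
img = Image.open("sample.jpg").convert("RGB")
label, confidence = predict_pil(img)
print(f"{label}: {confidence:.2%}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;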

&lt;h2&gt;
  
  
  Adding the User Interface with Streamlit
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_page_config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page_title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AI Image Detector&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;layout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;centered&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AI vs Human Image Detector&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Upload an image to detect whether it was generated by an AI model or captured by a human.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Streamlit handles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Page layout&lt;/li&gt;
&lt;li&gt;Title + description&lt;/li&gt;
&lt;li&gt;File uploader&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Upload and Display
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;uploaded&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;file_uploader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Upload Image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;jpg&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;jpeg&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;MAX_SIZE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;uploaded&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uploaded&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;convert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;RGB&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;thumbnail&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;MAX_SIZE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;caption&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Uploaded Image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;stretch&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We resize the image to a small thumbnail for display; the processor rescales it again to the model’s expected input size, so enough detail is preserved for classification.&lt;/p&gt;

&lt;h2&gt;
  
  
  Running Prediction &amp;amp; Displaying Results
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;spinner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Analyzing image...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;confidence&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;predict_pil&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We then interpret results:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;real&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;human&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;result_style&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Likely Human Captured&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;artificial&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ai&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;result_style&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Likely AI-Generated&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;result_style&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;label&lt;/span&gt;

&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;markdown&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;**Prediction:** **&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result_style&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;**&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;**Confidence:** **&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;confidence&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;%**&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We normalize the model’s labels into human-readable categories.&lt;/p&gt;
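&lt;p&gt;The exact label strings vary from checkpoint to checkpoint, which is why the matching above is defensive. You can inspect the map the loaded model actually uses (verify against the model card):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Assumes the model loaded earlier; prints the checkpoint's label map,
# e.g. {0: "artificial", 1: "human"} -- confirm on the model card.
print(model.config.id2label)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;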

&lt;h1&gt;
  
  
  Final Thoughts
&lt;/h1&gt;

&lt;p&gt;This project is a great example of how:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Streamlit&lt;/strong&gt; can turn any ML model into a usable app within minutes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transformers&lt;/strong&gt; makes loading advanced models extremely simple&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Device-aware code&lt;/strong&gt; ensures reliability across different hardware&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You now have everything needed to build, modify, or extend your own AI detection tools.&lt;/p&gt;
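&lt;p&gt;To run it locally, save the complete script (the filename &lt;code&gt;app.py&lt;/code&gt; here is just a convention) and launch it with Streamlit:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;streamlit run app.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;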

&lt;p&gt;Try the live app&lt;br&gt;
&lt;strong&gt;&lt;a href="https://tj-ai-image-detector.streamlit.app/" rel="noopener noreferrer"&gt;https://tj-ai-image-detector.streamlit.app/&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Get the source code&lt;br&gt;
&lt;strong&gt;&lt;a href="https://github.com/tijanidevit/ai-image-detector" rel="noopener noreferrer"&gt;https://github.com/tijanidevit/ai-image-detector&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Watch the YouTube demo&lt;br&gt;
&lt;strong&gt;&lt;a href="https://youtu.be/4aLgpu5sirA?si=S6B3kXkfRqBl1-P8" rel="noopener noreferrer"&gt;https://youtu.be/4aLgpu5sirA?si=S6B3kXkfRqBl1-P8&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>tutorial</category>
      <category>huggingface</category>
      <category>image</category>
    </item>
    <item>
      <title>The Complete Guide to NLP Text Preprocessing: Tokenization, Normalization, Stemming, Lemmatization, and More</title>
      <dc:creator>Mustapha Tijani</dc:creator>
      <pubDate>Fri, 14 Nov 2025 21:13:33 +0000</pubDate>
      <link>https://forem.com/themustaphatijani/the-complete-guide-to-nlp-text-preprocessing-tokenization-normalization-stemming-lemmatization-50ap</link>
      <guid>https://forem.com/themustaphatijani/the-complete-guide-to-nlp-text-preprocessing-tokenization-normalization-stemming-lemmatization-50ap</guid>
      <description>&lt;p&gt;&lt;em&gt;Natural Language Processing (NLP) powers today’s most advanced applications: intelligent search, sentiment analysis, chatbots, summarizers, recommendation engines, and large language models. But before any NLP system can understand text, the raw language must be cleaned, normalized, and transformed into structured formats that models can interpret.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Understanding the Importance of Text Preprocessing
&lt;/h2&gt;

&lt;p&gt;Raw text is messy. It contains punctuation, inconsistent capitalization, slang, typos, ambiguous words, and structure that machines cannot naturally interpret. Preprocessing transforms this messy input into a standardized, analyzable format.&lt;/p&gt;

&lt;p&gt;Why preprocessing matters:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;It improves model accuracy by reducing noise.&lt;/li&gt;
&lt;li&gt;It improves computational efficiency by reducing unnecessary text complexity.&lt;/li&gt;
&lt;li&gt;It increases consistency across datasets.&lt;/li&gt;
&lt;li&gt;It reveals the underlying structure of language, enabling better learning.&lt;/li&gt;
&lt;li&gt;It ensures models generalize well and avoid overfitting on noisy patterns.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The more carefully we preprocess, the better the downstream NLP model performs.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Tokenization
&lt;/h2&gt;

&lt;p&gt;Tokenization is the process of splitting text into meaningful units called tokens. These tokens can be words, subwords, or sentences depending on the task.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example
&lt;/h3&gt;

&lt;p&gt;Input:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;I&lt;/span&gt; &lt;span class="n"&gt;love&lt;/span&gt; &lt;span class="n"&gt;learning&lt;/span&gt; &lt;span class="n"&gt;Natural&lt;/span&gt; &lt;span class="n"&gt;Language&lt;/span&gt; &lt;span class="n"&gt;Processing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Word tokens:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;I&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;love&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;learning&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Natural&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Language&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Processing&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Example (NLTK)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;nltk.tokenize&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;word_tokenize&lt;/span&gt;

&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;I love learning Natural Language Processing.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;word_tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Tokenization is the first step because every subsequent processing stage depends on these tokens.&lt;/p&gt;
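&lt;p&gt;One caveat: &lt;code&gt;word_tokenize&lt;/code&gt; raises a &lt;code&gt;LookupError&lt;/code&gt; until NLTK’s tokenizer data has been downloaded, so run this once first:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import nltk

nltk.download("punkt")      # Punkt tokenizer models
nltk.download("punkt_tab")  # also required by recent NLTK releases
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;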




&lt;h2&gt;
  
  
  3. Text Normalization
&lt;/h2&gt;

&lt;p&gt;Normalization eliminates inconsistencies in text, ensuring two syntactically different but semantically identical expressions are treated the same.&lt;/p&gt;

&lt;p&gt;Key techniques in normalization:&lt;/p&gt;

&lt;h3&gt;
  
  
  3.1 Lowercasing
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;NEW YORK&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="err"&gt;→&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;new york&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3.2 Removing punctuation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;I&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;m happy!!!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="err"&gt;→&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;im happy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;
&lt;span class="n"&gt;clean&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;[^\w\s]&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3.3 Removing numbers (optional)
&lt;/h3&gt;

&lt;p&gt;This step is useful when numbers add noise rather than meaning, as in topic modeling or sentiment analysis.&lt;/p&gt;
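&lt;p&gt;A minimal sketch using &lt;code&gt;re&lt;/code&gt; (the sample sentence is made up; keep the digits if they carry meaning for your task):&lt;/p&gt;

```python
import re

text = "Chapter 12 has 3 sections"

# Remove digit runs, then collapse the whitespace left behind
no_numbers = re.sub(r"\d+", "", text)
no_numbers = " ".join(no_numbers.split())
print(no_numbers)  # Chapter has sections
```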

&lt;h3&gt;
  
  
  3.4 Removing extra whitespace
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  NLP    is powerful  &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Normalization shrinks the vocabulary and gives models a more consistent view of the text.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Stopword Removal
&lt;/h2&gt;

&lt;p&gt;Stopwords are extremely frequent words that carry little semantic weight.&lt;/p&gt;

&lt;p&gt;Common English stopwords include:&lt;br&gt;
the, is, am, are, of, to, in, on, for, with&lt;/p&gt;
&lt;h3&gt;
  
  
  Example
&lt;/h3&gt;

&lt;p&gt;Input:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;I&lt;/span&gt; &lt;span class="n"&gt;am&lt;/span&gt; &lt;span class="n"&gt;going&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After stopword removal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;going&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;store&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Code example
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;nltk.corpus&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;stopwords&lt;/span&gt;

&lt;span class="n"&gt;stop_words&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stopwords&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;words&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;english&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;filtered&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;stop_words&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;filtered&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Stopword removal is particularly useful for document classification, clustering, and search tasks.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. Stemming
&lt;/h2&gt;

&lt;p&gt;Stemming reduces a word to its base form using rule-based heuristics. It is fast but sometimes inaccurate because it does not consider context or grammar.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example transformations
&lt;/h3&gt;

&lt;p&gt;studies → studi&lt;br&gt;
learning → learn&lt;br&gt;
better → better&lt;/p&gt;
&lt;h3&gt;
  
  
  Code example
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;nltk.stem&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PorterStemmer&lt;/span&gt;

&lt;span class="n"&gt;stemmer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PorterStemmer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;words&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;studies&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;studying&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;learned&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;better&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;stems&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;stemmer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;words&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stems&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Stemming is appropriate when speed matters more than linguistic accuracy.&lt;/p&gt;


&lt;h2&gt;
  
  
  6. Lemmatization
&lt;/h2&gt;

&lt;p&gt;Lemmatization uses vocabulary and grammar rules to reduce words to their meaningful base form, called a lemma. It is more accurate than stemming.&lt;/p&gt;
&lt;h3&gt;
  
  
  Examples
&lt;/h3&gt;

&lt;p&gt;studies → study&lt;br&gt;
better → good&lt;br&gt;
mice → mouse&lt;/p&gt;
&lt;h3&gt;
  
  
  Example (WordNet)
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;nltk.stem&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;WordNetLemmatizer&lt;/span&gt;

&lt;span class="n"&gt;lemmatizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;WordNetLemmatizer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;words&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;studies&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;better&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mice&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;lemmas&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="n"&gt;lemmatizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lemmatize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;better&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pos&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;a&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;lemmatizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lemmatize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;studies&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;lemmatizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lemmatize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mice&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lemmas&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Lemmatization is essential for tasks requiring linguistic correctness such as translation, summarization, and semantic similarity.&lt;/p&gt;


&lt;h2&gt;
  
  
  7. POS Tagging (Part-of-Speech Tagging)
&lt;/h2&gt;

&lt;p&gt;POS tagging assigns grammatical labels to each token. This step is crucial for correct lemmatization and contextual text analysis.&lt;/p&gt;
&lt;h3&gt;
  
  
  Example
&lt;/h3&gt;

&lt;p&gt;The word "play" behaves differently depending on usage:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;As a noun: "The play was interesting."&lt;/li&gt;
&lt;li&gt;As a verb: "The children play outside."&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Code example
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;nltk&lt;/span&gt;
&lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nltk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;word_tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;The kids are playing outside&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;pos&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nltk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pos_tag&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pos&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;POS tags enable models to better understand sentence structure and meaning.&lt;/p&gt;


&lt;h2&gt;
  
  
  8. N-grams
&lt;/h2&gt;

&lt;p&gt;N-grams capture word sequences and preserve context that individual tokens may miss.&lt;/p&gt;
&lt;h3&gt;
  
  
  Examples
&lt;/h3&gt;

&lt;p&gt;Unigrams:&lt;br&gt;
i, love, machine, learning&lt;/p&gt;

&lt;p&gt;Bigrams:&lt;br&gt;
i love, love machine, machine learning&lt;/p&gt;

&lt;p&gt;Trigrams:&lt;br&gt;
i love machine, love machine learning&lt;/p&gt;
&lt;h3&gt;
  
  
  Code example
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;nltk.util&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ngrams&lt;/span&gt;

&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;I love machine learning&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;bigrams&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;ngrams&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bigrams&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;N-grams are frequently used in text classification, search ranking, and language modeling.&lt;/p&gt;


&lt;h2&gt;
  
  
  9. Text Vectorization (TF-IDF and Bag-of-Words)
&lt;/h2&gt;

&lt;p&gt;Machine learning models cannot operate on raw text. Vectorization transforms text into numerical features.&lt;/p&gt;
&lt;h3&gt;
  
  
  How TF-IDF works
&lt;/h3&gt;

&lt;p&gt;TF-IDF measures how important a word is in a document relative to a corpus.&lt;/p&gt;
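&lt;p&gt;The raw formula can be computed by hand. The toy documents below are made up, and scikit-learn's &lt;code&gt;TfidfVectorizer&lt;/code&gt; uses a smoothed IDF plus normalization, so its numbers will differ:&lt;/p&gt;

```python
import math

# Two toy documents, already tokenized
docs = [["i", "love", "machine", "learning"],
        ["machine", "learning", "loves", "data"]]

term = "love"
tf = docs[0].count(term) / len(docs[0])   # frequency of the term within document 0
df = sum(term in d for d in docs)         # number of documents containing the term
idf = math.log(len(docs) / df)            # rarer terms get a larger weight

print(tf * idf)  # ~0.173: "love" is distinctive for document 0
```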
&lt;h3&gt;
  
  
  Code example
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.feature_extraction.text&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TfidfVectorizer&lt;/span&gt;

&lt;span class="n"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;I love machine learning&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Machine learning loves data&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;tfidf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TfidfVectorizer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tfidf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit_transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tfidf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_feature_names_out&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toarray&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;TF-IDF is widely used in search engines, recommendation systems, and keyword extraction.&lt;/p&gt;


&lt;h2&gt;
  
  
  10. Putting It All Together
&lt;/h2&gt;

&lt;p&gt;Below is a full pipeline combining tokenization, normalization, stopword removal, and lemmatization.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;nltk&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;nltk.corpus&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;stopwords&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;nltk.stem&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;WordNetLemmatizer&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;nltk.tokenize&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;word_tokenize&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;preprocess&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Lowercase and remove punctuation
&lt;/span&gt;    &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;[^\w\s]&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Tokenize
&lt;/span&gt;    &lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;word_tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Remove stopwords
&lt;/span&gt;    &lt;span class="n"&gt;stop_words&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stopwords&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;words&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;english&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;stop_words&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="c1"&gt;# Lemmatize
&lt;/span&gt;    &lt;span class="n"&gt;lemmatizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;WordNetLemmatizer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;lemmatizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lemmatize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;preprocess&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;The cats are running in the gardens.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;['cat', 'running', 'garden']
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the backbone of many NLP systems, from sentiment analysis engines to document retrieval systems.&lt;/p&gt;




&lt;h2&gt;
  
  
  11. When to Use Each Technique
&lt;/h2&gt;

&lt;p&gt;Choosing the right preprocessing step depends on the task:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Task&lt;/th&gt;
&lt;th&gt;Recommended Steps&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Sentiment Analysis&lt;/td&gt;
&lt;td&gt;Tokenization, normalization, stopwords (optional), lemmatization&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Topic Modeling&lt;/td&gt;
&lt;td&gt;Tokenization, stopwords, lemmatization, n-grams&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Machine Translation&lt;/td&gt;
&lt;td&gt;Tokenization, normalization, POS tagging&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Search Engines&lt;/td&gt;
&lt;td&gt;Tokenization, stopwords, stemming or lemmatization, TF-IDF&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deep Learning Models&lt;/td&gt;
&lt;td&gt;Minimal preprocessing (tokenization + normalization)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  12. Modern Tokenization
&lt;/h2&gt;

&lt;p&gt;Contemporary NLP models like GPT, BERT, and LLaMA use advanced tokenization techniques such as Byte-Pair Encoding (BPE) and SentencePiece.&lt;/p&gt;

&lt;p&gt;These models do not rely heavily on stopword removal, stemming, or lemmatization because they learn complex linguistic patterns directly from raw text.&lt;/p&gt;

&lt;p&gt;However, classical preprocessing remains essential for traditional ML pipelines and many industrial NLP workflows.&lt;/p&gt;
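&lt;p&gt;The heart of BPE is simple to sketch: count adjacent symbol pairs across a word-frequency vocabulary and merge the most frequent pair into a new symbol. The tiny vocabulary below is invented for illustration; real tokenizers learn tens of thousands of merges from large corpora:&lt;/p&gt;

```python
from collections import Counter

# Toy vocabulary: each word is a tuple of symbols plus its corpus frequency.
# "</w>" marks the end of a word, a common BPE convention.
vocab = {
    ("l", "o", "w", "</w>"): 5,
    ("l", "o", "w", "e", "r", "</w>"): 2,
    ("n", "e", "w", "e", "s", "t", "</w>"): 6,
    ("w", "i", "d", "e", "s", "t", "</w>"): 3,
}

def most_frequent_pair(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for word, freq in vocab.items():
        for a, b in zip(word, word[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0]

def merge_pair(pair, vocab):
    """Fuse every occurrence of `pair` into a single new symbol."""
    merged = {}
    for word, freq in vocab.items():
        new_word, i = [], 0
        while i < len(word):
            if i + 1 < len(word) and (word[i], word[i + 1]) == pair:
                new_word.append(word[i] + word[i + 1])
                i += 2
            else:
                new_word.append(word[i])
                i += 1
        merged[tuple(new_word)] = freq
    return merged

best = most_frequent_pair(vocab)
print(best)  # ('e', 's') is the most frequent pair in this toy vocabulary
vocab = merge_pair(best, vocab)
print(("n", "e", "w", "es", "t", "</w>") in vocab)  # True
```

&lt;p&gt;Libraries such as Hugging Face Tokenizers and SentencePiece implement heavily optimized versions of this merge loop.&lt;/p&gt;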




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Text preprocessing is the foundation of every successful NLP project. By understanding tokenization, normalization, stopword removal, stemming, lemmatization, POS tagging, n-grams, and vectorization, you gain full control over how text is interpreted and transformed for machine learning.&lt;/p&gt;

</description>
      <category>datascience</category>
      <category>tutorial</category>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>From AI to NLP: The Four Phases of Language Understanding</title>
      <dc:creator>Mustapha Tijani</dc:creator>
      <pubDate>Wed, 05 Nov 2025 20:28:28 +0000</pubDate>
      <link>https://forem.com/themustaphatijani/from-ai-to-nlp-the-four-phases-of-language-understanding-3gp8</link>
      <guid>https://forem.com/themustaphatijani/from-ai-to-nlp-the-four-phases-of-language-understanding-3gp8</guid>
      <description>&lt;p&gt;&lt;em&gt;From handcrafted grammar rules to transformer models like GPT, the evolution of NLP tells a fascinating story about how machines learned to understand human language. This article breaks down the four major phases — rule-based, statistical, neural, and transformer — with code examples, pros, and cons.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;For decades, scientists have dreamed of teaching machines to understand human language. This journey — from simple rules to self-learning models — forms one of the most fascinating stories in artificial intelligence.&lt;/p&gt;

&lt;p&gt;At the heart of it is &lt;strong&gt;Natural Language Processing (NLP)&lt;/strong&gt;, a field where linguistics, computer science, and machine learning meet. It powers everything from Google Translate to ChatGPT, enabling computers to process, analyze, and generate language.&lt;/p&gt;

&lt;p&gt;But NLP wasn’t always as advanced as it is today. Its evolution happened in &lt;strong&gt;four main phases&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Rule-Based Systems&lt;/li&gt;
&lt;li&gt;Statistical Models&lt;/li&gt;
&lt;li&gt;Neural Networks&lt;/li&gt;
&lt;li&gt;Transformer Models&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Let’s explore each one — how it worked, what it could do, and where it fell short.&lt;/p&gt;




&lt;h2&gt;
  
  
  Phase 1: Rule-Based NLP
&lt;/h2&gt;

&lt;p&gt;Before machine learning, NLP systems relied entirely on &lt;strong&gt;human-written rules&lt;/strong&gt; — grammar, syntax, and pattern-matching logic.&lt;/p&gt;

&lt;p&gt;These systems used dictionaries, if-else logic, and handcrafted linguistic rules to analyze or generate text.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example: Regex-based Named Entity Recognition
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;

&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;My name is Mustapha and I live in Lagos, Nigeria.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="c1"&gt;# Simple rule-based name detection
&lt;/span&gt;&lt;span class="n"&gt;pattern&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;My name is ([A-Z][a-z]+)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;match&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pattern&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Detected Name:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;group&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Detected Name: Mustapha
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is simple — it “understands” language by &lt;em&gt;pattern&lt;/em&gt;, not meaning.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pros
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Easy to interpret and debug&lt;/li&gt;
&lt;li&gt;Works well for structured, predictable input&lt;/li&gt;
&lt;li&gt;Requires no data — only linguistic expertise&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Doesn’t generalize — new phrases break the rules&lt;/li&gt;
&lt;li&gt;Hard to scale across languages and domains&lt;/li&gt;
&lt;li&gt;No learning — it can’t improve with data&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Phase 2: Statistical NLP
&lt;/h2&gt;

&lt;p&gt;In the 1990s, researchers realized that instead of &lt;em&gt;telling&lt;/em&gt; computers what language looks like, they could let them &lt;em&gt;learn&lt;/em&gt; from data.&lt;/p&gt;

&lt;p&gt;This gave birth to &lt;strong&gt;statistical NLP&lt;/strong&gt; — systems that used probabilities to predict text patterns.&lt;/p&gt;

&lt;p&gt;For example, a model could learn that the word “language” is often followed by “processing” or “model.”&lt;/p&gt;

&lt;h3&gt;
  
  
  Example: N-gram Model
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;nltk&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;word_tokenize&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bigrams&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;FreqDist&lt;/span&gt;

&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Natural language processing makes machines understand human language.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;word_tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

&lt;span class="c1"&gt;# Create bigrams (pairs of consecutive words)
&lt;/span&gt;&lt;span class="n"&gt;bi_grams&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;bigrams&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="c1"&gt;# Calculate frequency
&lt;/span&gt;&lt;span class="n"&gt;fdist&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FreqDist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bi_grams&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fdist&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;most_common&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[(('natural', 'language'), 1), (('language', 'processing'), 1), (('processing', 'makes'), 1)]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, the machine “learns” which words tend to occur together — not by rule, but by &lt;em&gt;probability&lt;/em&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pros
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Learns from data automatically&lt;/li&gt;
&lt;li&gt;Adapts better to new text&lt;/li&gt;
&lt;li&gt;Enables machine translation, tagging, and speech recognition&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Requires large labeled datasets&lt;/li&gt;
&lt;li&gt;Struggles with long-distance dependencies in language&lt;/li&gt;
&lt;li&gt;No real understanding — just pattern counting&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Phase 3: Neural Network NLP
&lt;/h2&gt;

&lt;p&gt;By the 2010s, computing power and data grew — and &lt;strong&gt;neural networks&lt;/strong&gt; began transforming NLP.&lt;/p&gt;

&lt;p&gt;Neural networks can model complex relationships and context better than statistical models. Instead of counting word pairs, they &lt;em&gt;embed&lt;/em&gt; words as numerical vectors — capturing meaning, not just frequency.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example: Word2Vec Embeddings
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;gensim.models&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Word2Vec&lt;/span&gt;

&lt;span class="n"&gt;sentences&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;king&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;queen&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;man&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;woman&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;royal&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;power&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Word2Vec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sentences&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;min_count&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vector_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;wv&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;most_similar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;king&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Output (example):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[('queen', 0.85), ('royal', 0.74)]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The model learns that “king” and “queen” are similar in context — a form of &lt;em&gt;semantic understanding&lt;/em&gt;.&lt;/p&gt;
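&lt;p&gt;How does a model decide that two word vectors are “similar”? The standard measure is &lt;strong&gt;cosine similarity&lt;/strong&gt;: the cosine of the angle between the two vectors. Here is a minimal sketch using hand-picked toy vectors (chosen for illustration, not learned weights, and far smaller than real embeddings):&lt;/p&gt;

```python
# A minimal sketch of how "similarity" between word vectors is measured,
# using hypothetical 3-dimensional embeddings (real models use 100+ dims).
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: 1.0 means identical direction.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy vectors chosen by hand for illustration, not learned weights.
king  = np.array([0.9, 0.8, 0.1])
queen = np.array([0.8, 0.9, 0.2])
car   = np.array([0.1, 0.2, 0.9])

print(cosine_similarity(king, queen))  # high: related words point the same way
print(cosine_similarity(king, car))    # lower: unrelated words diverge
```

&lt;p&gt;Word2Vec’s &lt;code&gt;most_similar&lt;/code&gt; ranks candidate words by exactly this score.&lt;/p&gt;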

&lt;h3&gt;
  
  
  Pros
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Learns semantic relationships&lt;/li&gt;
&lt;li&gt;Handles complex, unstructured language&lt;/li&gt;
&lt;li&gt;Generalizes better to unseen data&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Requires heavy computation&lt;/li&gt;
&lt;li&gt;Hard to interpret (black box)&lt;/li&gt;
&lt;li&gt;Needs large training data and GPUs&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Phase 4: The Transformer Era
&lt;/h2&gt;

&lt;p&gt;Then came the &lt;strong&gt;transformer models&lt;/strong&gt;, introduced in 2017 with the paper &lt;em&gt;“Attention Is All You Need.”&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Transformers changed everything — they process entire sentences at once, capturing relationships across long distances using a mechanism called &lt;strong&gt;self-attention&lt;/strong&gt;.&lt;/p&gt;
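&lt;p&gt;At its core, self-attention is only a few matrix multiplications. Here is a from-scratch NumPy sketch of scaled dot-product attention for a single head, using toy dimensions and random stand-in weights rather than a trained model:&lt;/p&gt;

```python
# A from-scratch sketch of single-head self-attention, assuming toy
# 4-dimensional embeddings for a 3-token sentence. The weight matrices
# are random stand-ins for parameters a real model would learn.
import numpy as np

rng = np.random.default_rng(0)
d = 4                        # embedding dimension
X = rng.normal(size=(3, d))  # one row per token

# Learned projections in a real model; random here for illustration.
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))
Q, K, V = X @ W_q, X @ W_k, X @ W_v

# Scaled dot-product attention: every token scores every other token,
# so relationships across any distance are captured in a single step.
scores = Q @ K.T / np.sqrt(d)

# Row-wise softmax (shifted by the row max for numerical stability).
weights = np.exp(scores - scores.max(axis=1, keepdims=True))
weights /= weights.sum(axis=1, keepdims=True)

output = weights @ V
print(weights.round(2))  # each row sums to 1: how much a token attends to the others
```

&lt;p&gt;Because every token scores every other token directly, a dependency between the first and last words of a sentence costs no more to model than one between neighbors.&lt;/p&gt;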

&lt;p&gt;Modern models like &lt;strong&gt;BERT&lt;/strong&gt;, &lt;strong&gt;GPT-3&lt;/strong&gt;, and &lt;strong&gt;T5&lt;/strong&gt; are built on this architecture. They can summarize, translate, answer questions, and even generate creative text.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example: Using a Transformer for Text Generation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pipeline&lt;/span&gt;

&lt;span class="n"&gt;generator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;pipeline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text-generation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;generator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Artificial intelligence is transforming&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_length&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;40&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_return_sequences&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;generated_text&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Output (sample):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Artificial intelligence is transforming the way humans think, work, and create new possibilities for innovation.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Pros
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Understands context deeply and globally&lt;/li&gt;
&lt;li&gt;State-of-the-art performance across NLP tasks&lt;/li&gt;
&lt;li&gt;Pretrained on massive datasets — fine-tuning is easy&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Computationally expensive&lt;/li&gt;
&lt;li&gt;Requires huge amounts of data&lt;/li&gt;
&lt;li&gt;Can generate biased or inaccurate content&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Conclusion: From Rules to Reasoning
&lt;/h2&gt;

&lt;p&gt;The evolution of NLP — from &lt;strong&gt;rules&lt;/strong&gt; to &lt;strong&gt;statistics&lt;/strong&gt;, then &lt;strong&gt;neurons&lt;/strong&gt;, and finally &lt;strong&gt;transformers&lt;/strong&gt; — mirrors our own human learning.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Rule-based&lt;/strong&gt; systems followed strict grammar.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Statistical&lt;/strong&gt; models learned from frequency.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Neural&lt;/strong&gt; models captured meaning.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transformers&lt;/strong&gt; learned context.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each phase brought machines a little closer to understanding language the way humans do. And as transformers evolve into multimodal systems that process text, images, and sound, the next phase of NLP might finally feel... &lt;em&gt;human.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>datascience</category>
      <category>nlp</category>
    </item>
    <item>
      <title>Don’t Be Someone People Have to Work With — Be Someone They Want to Work With</title>
      <dc:creator>Mustapha Tijani</dc:creator>
      <pubDate>Wed, 05 Nov 2025 19:38:00 +0000</pubDate>
      <link>https://forem.com/themustaphatijani/dont-be-someone-people-have-to-work-with-be-someone-they-want-to-work-with-4cb2</link>
      <guid>https://forem.com/themustaphatijani/dont-be-someone-people-have-to-work-with-be-someone-they-want-to-work-with-4cb2</guid>
      <description>&lt;div class="ltag__link"&gt;
  &lt;a href="/themustaphatijani" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__pic"&gt;
      &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3565453%2F3fe00f03-7fa8-4080-9b39-0794b5af20fc.jpeg" alt="themustaphatijani"&gt;
    &lt;/div&gt;
  &lt;/a&gt;
  &lt;a href="https://dev.to/themustaphatijani/theres-a-difference-between-being-respected-and-being-liked-but-theres-a-special-kind-of-power-357i" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__content"&gt;
      &lt;h2&gt;Don’t Be Someone People Have to Work With — Be Someone They Want to Work With&lt;/h2&gt;
      &lt;h3&gt;Mustapha Tijani ・ Oct 20&lt;/h3&gt;
      &lt;div class="ltag__link__taglist"&gt;
        &lt;span class="ltag__link__tag"&gt;#leadership&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#careerdevelopment&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#team&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#culture&lt;/span&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
&lt;/div&gt;


</description>
      <category>leadership</category>
      <category>careerdevelopment</category>
      <category>team</category>
      <category>culture</category>
    </item>
    <item>
      <title>Don’t Be Someone People Have to Work With — Be Someone They Want to Work With</title>
      <dc:creator>Mustapha Tijani</dc:creator>
      <pubDate>Mon, 20 Oct 2025 11:00:00 +0000</pubDate>
      <link>https://forem.com/themustaphatijani/theres-a-difference-between-being-respected-and-being-liked-but-theres-a-special-kind-of-power-357i</link>
      <guid>https://forem.com/themustaphatijani/theres-a-difference-between-being-respected-and-being-liked-but-theres-a-special-kind-of-power-357i</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F97jykjfl32hr1g5jnx8t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F97jykjfl32hr1g5jnx8t.png" alt="There’s a difference between being respected and being liked. But there’s a special kind of power in being both." width="800" height="533"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;There’s a difference between being respected and being liked. But there’s a special kind of power in being both.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In the world of engineering, technical brilliance often takes the spotlight — clean code, optimized systems, and innovative architecture. But behind every successful product lies something more subtle: people who know how to work well together.&lt;/p&gt;

&lt;p&gt;Many developers start their careers believing that great work is about being right, defending their code, or proving their ideas. Over time, they learn that true success isn't about winning arguments — it's about helping the team win.&lt;/p&gt;

&lt;p&gt;I've worked with people who were technically brilliant but emotionally tone-deaf. I've also worked with others whose kindness and reliability made every project smoother.&lt;br&gt;&lt;br&gt;
The difference is night and day.&lt;/p&gt;

&lt;p&gt;Here are &lt;strong&gt;10 practical ways&lt;/strong&gt; to become the person people &lt;em&gt;want&lt;/em&gt; to work with — the kind of engineer who inspires trust, respect, and collaboration.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Let Go of the “Always Right” Mindset
&lt;/h2&gt;

&lt;p&gt;Being right means nothing if it damages trust. Listening doesn’t mean you’re wrong — it means you’re wise enough to see the bigger picture.&lt;/p&gt;

&lt;p&gt;The truth is, in engineering, there are often multiple right answers. Your approach might be elegant, but someone else’s might be simpler, faster, or more practical for the team.&lt;br&gt;&lt;br&gt;
Maturity comes from choosing collaboration over ego. Great engineers don’t just build features — they build alignment.&lt;/p&gt;

&lt;p&gt;When you start seeing debates as opportunities to learn, not to win, everything changes.&lt;br&gt;&lt;br&gt;
You stop defending your code and start defending your team’s success.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Make People Comfortable Asking You Questions
&lt;/h2&gt;

&lt;p&gt;If teammates hesitate before messaging you, something’s off.&lt;br&gt;&lt;br&gt;
Your tone and attitude can either make you a mentor or a wall.&lt;/p&gt;

&lt;p&gt;Knowledge hoarding might make you feel powerful for a while, but knowledge sharing makes you invaluable.&lt;/p&gt;

&lt;p&gt;When someone asks a “simple” question, remember that you were once in their shoes.&lt;br&gt;&lt;br&gt;
Explain things patiently, and if you sense recurring confusion, document it.&lt;/p&gt;

&lt;p&gt;The best developers don’t just write reusable code — they create reusable knowledge.&lt;br&gt;&lt;br&gt;
A team where people freely ask questions moves faster, breaks less, and builds better.&lt;br&gt;&lt;br&gt;
Your openness becomes part of the culture.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Deliver Without Drama
&lt;/h2&gt;

&lt;p&gt;Reliability is quiet power.&lt;br&gt;&lt;br&gt;
Everyone remembers the developer who delivers consistently — not with fireworks, but with calm focus.  &lt;/p&gt;

&lt;p&gt;They’re the ones who don’t panic when production breaks; they just roll up their sleeves and fix it.&lt;/p&gt;

&lt;p&gt;When you deliver without drama, you create psychological safety.&lt;br&gt;&lt;br&gt;
People know they can count on you. You become the anchor when things get chaotic — and every good team needs an anchor.&lt;/p&gt;

&lt;p&gt;This doesn’t mean suppressing your emotions; it means handling pressure with maturity.&lt;br&gt;&lt;br&gt;
You can say, “This is tough,” without turning it into tension.&lt;br&gt;&lt;br&gt;
That emotional balance is rare — and it’s magnetic.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Handle Delays with Empathy
&lt;/h2&gt;

&lt;p&gt;Sometimes your progress depends on others. Maybe design isn’t ready, or backend hasn’t exposed the endpoint. It’s easy to feel frustrated.&lt;br&gt;&lt;br&gt;
But empathy transforms delay into dialogue.&lt;/p&gt;

&lt;p&gt;Instead of snapping, ask, “How can I help move this forward?”&lt;br&gt;&lt;br&gt;
Maybe someone’s stuck, burnt out, or juggling too many tasks. Offering help — even just a small hand — turns blockers into breakthroughs.&lt;/p&gt;

&lt;p&gt;Empathy doesn’t make you weak; it makes you effective.&lt;br&gt;&lt;br&gt;
People remember how you treat them when things go wrong more than when everything’s perfect.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. Show Up Early, Look Sharp
&lt;/h2&gt;

&lt;p&gt;Professionalism is underrated in tech.&lt;br&gt;&lt;br&gt;
Whether remote or hybrid, showing up early and looking put-together signals discipline.&lt;br&gt;&lt;br&gt;
You don’t do it to impress — you do it because you respect the work.&lt;/p&gt;

&lt;p&gt;I once worked with a developer who never missed a standup and was always prepared before anyone else.&lt;br&gt;&lt;br&gt;
Over time, his reliability became his brand. That’s what happens when consistency meets respect — people trust you more with bigger responsibilities.&lt;/p&gt;

&lt;p&gt;Even in a casual environment, “showing up” well — physically, mentally, and emotionally — sets the tone for excellence.&lt;/p&gt;




&lt;h2&gt;
  
  
  6. Be Prepared for Every Meeting
&lt;/h2&gt;

&lt;p&gt;Meetings are often where leadership visibility happens — even for non-leaders.&lt;br&gt;&lt;br&gt;
If you walk into meetings without clarity, you waste time.&lt;br&gt;&lt;br&gt;
But when you come in with points, blockers, and updates ready, you stand out.&lt;/p&gt;

&lt;p&gt;Preparation also means listening actively. Don’t be the person who waits for their name before paying attention.&lt;br&gt;&lt;br&gt;
Follow the flow, note key decisions, and offer insights when relevant.&lt;/p&gt;

&lt;p&gt;Your preparedness signals respect for others’ time.&lt;br&gt;&lt;br&gt;
Over time, people will start turning to you for context and clarity — because they know you always come ready.&lt;/p&gt;




&lt;h2&gt;
  
  
  7. Keep Your Word, Always
&lt;/h2&gt;

&lt;p&gt;If you say you’ll do something, do it. If something changes, communicate early.&lt;br&gt;&lt;br&gt;
Reliability is the quiet foundation of trust.&lt;/p&gt;

&lt;p&gt;Teams thrive when promises mean something.&lt;br&gt;&lt;br&gt;
When your word consistently aligns with your actions, you become a cornerstone of predictability — a rare and precious trait in fast-moving teams.&lt;/p&gt;

&lt;p&gt;Think of it like version control for your character — every fulfilled promise is a commit to your reputation.&lt;br&gt;&lt;br&gt;
Guard it carefully.&lt;/p&gt;




&lt;h2&gt;
  
  
  8. Bring Energy, Not Ego
&lt;/h2&gt;

&lt;p&gt;You don’t need to be the loudest voice in the room to lead.&lt;br&gt;&lt;br&gt;
But your energy matters — it’s contagious.&lt;/p&gt;

&lt;p&gt;A calm, positive presence often does more for morale than a 10x engineer’s brilliance.&lt;br&gt;&lt;br&gt;
Smiling, greeting people warmly, checking in with teammates — these small gestures compound into goodwill.&lt;br&gt;&lt;br&gt;
They make people look forward to working with you.&lt;/p&gt;

&lt;p&gt;Ego isolates. Energy connects.&lt;br&gt;&lt;br&gt;
Be the one who lifts the room, not drains it.&lt;/p&gt;




&lt;h2&gt;
  
  
  9. Code Like You Care About the Next Developer
&lt;/h2&gt;

&lt;p&gt;Readable, maintainable code is empathy in action.&lt;br&gt;&lt;br&gt;
It’s your future self’s thank-you note to your present self.&lt;/p&gt;

&lt;p&gt;Write clean functions, use meaningful variable names, leave clear comments.&lt;br&gt;&lt;br&gt;
Don’t just write for the compiler — write for the human who will debug it at 2 AM six months later.&lt;/p&gt;

&lt;p&gt;When people read your code and smile because it’s clear and considerate, that’s a form of craftsmanship.&lt;br&gt;&lt;br&gt;
That’s legacy-level work.&lt;/p&gt;




&lt;h2&gt;
  
  
  10. Lead by Example, Not Title
&lt;/h2&gt;

&lt;p&gt;Leadership isn’t about hierarchy — it’s about influence.&lt;br&gt;&lt;br&gt;
You lead when your actions inspire others to do better, even when nobody’s watching.&lt;/p&gt;

&lt;p&gt;Whether you’re a junior or a CTO, every project gives you a chance to model excellence: how you handle stress, feedback, failure, or success.&lt;br&gt;&lt;br&gt;
Your consistency and fairness will echo louder than any job title.&lt;/p&gt;

&lt;p&gt;The best leaders don’t need authority to command respect — their work and attitude speak for them.&lt;/p&gt;




&lt;h2&gt;
  
  
  Closing Thought
&lt;/h2&gt;

&lt;p&gt;At the end of the day, nobody remembers how perfectly you optimized that query or structured that API.&lt;br&gt;&lt;br&gt;
They remember how you made them feel while doing it together.&lt;/p&gt;

&lt;p&gt;So don’t just be the person people have to work with — be the one they &lt;em&gt;want&lt;/em&gt; to work with.  &lt;/p&gt;

&lt;p&gt;That’s how great teams are built, and lasting careers are made.&lt;/p&gt;

</description>
      <category>leadership</category>
      <category>careerdevelopment</category>
      <category>team</category>
      <category>culture</category>
    </item>
    <item>
      <title>Coding as Poetry: Why Every Engineer Should Write Readable Code</title>
      <dc:creator>Mustapha Tijani</dc:creator>
      <pubDate>Tue, 14 Oct 2025 23:01:56 +0000</pubDate>
      <link>https://forem.com/themustaphatijani/coding-as-poetry-why-every-engineer-should-write-readable-code-4j91</link>
      <guid>https://forem.com/themustaphatijani/coding-as-poetry-why-every-engineer-should-write-readable-code-4j91</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpyrrhrdhr8v5664zzw2t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpyrrhrdhr8v5664zzw2t.png" alt=" " width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  I’ve built things. Now, I want to start writing.
&lt;/h2&gt;

&lt;p&gt;For years, I’ve lived behind the screen, architecting systems, reviewing pull requests, optimizing APIs, and mentoring developers.&lt;br&gt;&lt;br&gt;
But recently, I’ve realized something powerful: &lt;em&gt;it’s not enough to build&lt;/em&gt;. It’s time to also &lt;strong&gt;document the journey&lt;/strong&gt; — the thoughts, mistakes, lessons, and philosophies that shape great engineering.&lt;/p&gt;

&lt;p&gt;So, here’s my first piece — a reflection on something I deeply believe:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Coding is poetry.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Why coding is more than logic
&lt;/h2&gt;

&lt;p&gt;Every line of code carries rhythm, tone, and intent.&lt;br&gt;&lt;br&gt;
Good engineers don’t just solve problems; they express ideas elegantly.&lt;/p&gt;

&lt;p&gt;Readable code, like good poetry, &lt;strong&gt;transcends syntax&lt;/strong&gt;. It communicates meaning — to your teammates, to your future self, and to the next developer who inherits your work.  &lt;/p&gt;

&lt;p&gt;The best engineers write code that others can &lt;em&gt;feel&lt;/em&gt; and &lt;em&gt;understand&lt;/em&gt; — not just execute.&lt;/p&gt;

&lt;h2&gt;
  
  
  A lesson from breaking things
&lt;/h2&gt;

&lt;p&gt;I learned this the hard way.&lt;/p&gt;

&lt;p&gt;Early in my career, I made a seemingly small optimization — a tweak that improved performance but broke backward compatibility.&lt;br&gt;&lt;br&gt;
Production went down. The issue wasn’t my logic; it was my &lt;strong&gt;lack of empathy&lt;/strong&gt; for the unseen dependencies in the system.&lt;/p&gt;

&lt;p&gt;That day, my CTO pulled me aside and said something I’ll never forget:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Code isn’t just yours. It’s a conversation between everyone who will ever touch it. It’s a contract between you and anyone using your APIs, whether you know them or not.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Those words changed how I think about software forever. From that moment, I started writing every function like a sentence in a story: clear, intentional, and kind to the next reader.&lt;/p&gt;

&lt;h2&gt;
  
  
  Collaboration is a creative act
&lt;/h2&gt;

&lt;p&gt;I’ve worked with brilliant developers — some humble, some prideful.&lt;br&gt;&lt;br&gt;
But the strongest teams I’ve seen are the ones that treat coding as &lt;strong&gt;collaboration&lt;/strong&gt;, not competition.&lt;/p&gt;

&lt;p&gt;A great developer isn’t just someone who writes performant code — they’re someone who &lt;strong&gt;writes understandable code&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
The goal is not to impress; it’s to express.  &lt;/p&gt;

&lt;p&gt;When your code reads like poetry, others want to read it, improve it, and contribute to it. That’s how engineering cultures grow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Coding is leadership
&lt;/h2&gt;

&lt;p&gt;Readable code is leadership in written form.&lt;br&gt;&lt;br&gt;
It’s how you lead your teammates, even when you’re not in the room.&lt;/p&gt;

&lt;p&gt;Leadership in engineering doesn’t always come from meetings or titles.&lt;br&gt;&lt;br&gt;
Sometimes it’s from a well-named variable, a clear comment, or a design pattern that makes someone whisper, &lt;em&gt;“Oh, now I get it.”&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Writing code that breathes
&lt;/h2&gt;

&lt;p&gt;Readable code has a pulse.&lt;br&gt;&lt;br&gt;
It’s consistent, predictable, and intentional.&lt;/p&gt;

&lt;p&gt;It doesn’t try to be clever; it tries to be kind.&lt;br&gt;&lt;br&gt;
It anticipates misunderstanding and guards against it.&lt;br&gt;&lt;br&gt;
It uses simplicity as a weapon against chaos.&lt;/p&gt;

&lt;p&gt;In short — &lt;strong&gt;clean code is kindness in action&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final thoughts
&lt;/h2&gt;

&lt;p&gt;We often celebrate the complexity of what we build — the architecture, the frameworks, the performance metrics.&lt;br&gt;&lt;br&gt;
But the real beauty lies in simplicity and clarity.&lt;/p&gt;

&lt;p&gt;Every time you open your editor, remember:&lt;br&gt;&lt;br&gt;
You’re not just writing instructions for a machine.&lt;br&gt;&lt;br&gt;
You’re writing poetry for humans who will walk your path.&lt;/p&gt;

&lt;p&gt;So write beautifully.&lt;br&gt;&lt;br&gt;
Write thoughtfully.&lt;br&gt;&lt;br&gt;
Write code that speaks.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This is the beginning of my writing journey as a software engineer and CTO. If this resonates with you, follow me — I’ll be sharing stories, lessons, and reflections on software engineering, leadership, and the art of building with clarity.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>cleancode</category>
      <category>programming</category>
      <category>beginners</category>
      <category>software</category>
    </item>
  </channel>
</rss>
