<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Rahul @codingkite</title>
    <description>The latest articles on Forem by Rahul @codingkite (@codingkite).</description>
    <link>https://forem.com/codingkite</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F686572%2F84c57480-6670-4deb-8d1a-caa544c404e3.png</url>
      <title>Forem: Rahul @codingkite</title>
      <link>https://forem.com/codingkite</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/codingkite"/>
    <language>en</language>
    <item>
      <title>📘 Prompt Engineering: Mastering the Art of Talking to AI</title>
      <dc:creator>Rahul @codingkite</dc:creator>
      <pubDate>Sat, 19 Apr 2025 03:48:43 +0000</pubDate>
      <link>https://forem.com/codingkite/prompt-engineering-mastering-the-art-of-talking-to-ai-571j</link>
      <guid>https://forem.com/codingkite/prompt-engineering-mastering-the-art-of-talking-to-ai-571j</guid>
      <description>&lt;h2&gt;
  
  
  🧭 Topics Covered:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;🗑️ GIGO: Garbage In, Garbage Out
&lt;/li&gt;
&lt;li&gt;✍️ What is Prompt Engineering?
&lt;/li&gt;
&lt;li&gt;🧠 How Prompts Influence LLM Output
&lt;/li&gt;
&lt;li&gt;🧰 Types &amp;amp; Styles of Prompts
&lt;/li&gt;
&lt;li&gt;🧪 Prompting Techniques (Zero-shot, Few-shot, CoT, etc.)
&lt;/li&gt;
&lt;li&gt;🧙‍♂️ Role, Persona &amp;amp; Contextual Prompting
&lt;/li&gt;
&lt;li&gt;🧩 Which Prompt Technique to Choose?
&lt;/li&gt;
&lt;li&gt;🔐 Prompt Templates &amp;amp; Security&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🎬 Let’s Begin with a Story…
&lt;/h2&gt;

&lt;p&gt;Imagine you walk into a pizza shop 🍕 and say,  &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Give me food.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The waiter looks puzzled.  &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Uhh… what kind? Spicy? Veg? Size? Cheese?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Now instead, you say,  &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Can I get a large margherita pizza with extra cheese and jalapeños?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Boom 💥—now you're served exactly what you want. AI models work the same way. The better you explain your needs (prompt), the better the result (output)!&lt;/p&gt;

&lt;p&gt;And this is where &lt;strong&gt;Prompt Engineering&lt;/strong&gt; comes in.&lt;/p&gt;




&lt;h2&gt;
  
  
  🗑️ GIGO – Garbage In, Garbage Out
&lt;/h2&gt;

&lt;p&gt;Have you ever typed something into ChatGPT and got a weird or useless response?&lt;/p&gt;

&lt;p&gt;Well, that’s GIGO at play:&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Garbage Input = Garbage Output&lt;/strong&gt; 💩&lt;/p&gt;

&lt;p&gt;AI is like a mirror—it reflects what you give it. Messy or vague input leads to confusing results. So crafting the right input is everything.&lt;/p&gt;


&lt;h2&gt;
  
  
  ✍️ What is Prompt Engineering?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Prompt Engineering&lt;/strong&gt; is the skill of writing &lt;em&gt;smart instructions&lt;/em&gt; (called prompts) to get &lt;em&gt;smart results&lt;/em&gt; from an AI model 🤖.&lt;/p&gt;

&lt;p&gt;Later in this blog we will learn about more technical stuff, like different prompt formats and techniques.&lt;/p&gt;


&lt;h3&gt;
  
  
  🤔 What is a Prompt?
&lt;/h3&gt;

&lt;p&gt;A prompt is the &lt;em&gt;initial instruction&lt;/em&gt; or input you give to the AI to perform a task.&lt;br&gt;&lt;br&gt;
But here's a catch…&lt;/p&gt;

&lt;p&gt;If you ask AI to generate a prompt, and then feed that prompt back to the AI, the results are often not great 😬. Why?&lt;/p&gt;

&lt;p&gt;Probabl because most LLMs (like GPT or Gemini) were trained on &lt;em&gt;human-written&lt;/em&gt; content—not AI-generated ones.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🔑 Takeaway:&lt;/strong&gt; Always prefer writing your own prompts over relying on AI-generated ones.&lt;/p&gt;


&lt;h2&gt;
  
  
  🧠 System Prompts
&lt;/h2&gt;

&lt;p&gt;System prompts help set the &lt;strong&gt;initial context&lt;/strong&gt; for the conversation.&lt;/p&gt;

&lt;p&gt;As developers, we can’t control user queries, but we &lt;em&gt;can&lt;/em&gt; control the system prompt to steer the AI’s tone, behavior, or role 🎛️.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“You are a helpful travel assistant.” – That’s a system prompt.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Also, keep in mind:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;LLMs charge you based on both &lt;strong&gt;input&lt;/strong&gt; and &lt;strong&gt;output&lt;/strong&gt; tokens 💰.Checkout the pricing page of the model that you are using.&lt;/li&gt;
&lt;li&gt;Tokens should  &lt;em&gt;not&lt;/em&gt; be considered the same as words&lt;/li&gt;
&lt;li&gt;Repeating the same system prompt? It might be &lt;strong&gt;cached&lt;/strong&gt; and priced differently&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  ✨ Prompt Templates – Why Bother Using Them?
&lt;/h2&gt;

&lt;p&gt;Imagine sending raw user input straight to the AI. That’s risky!&lt;/p&gt;
&lt;h3&gt;
  
  
  🔒&lt;strong&gt;The Problem: Prompt Injection Attacks&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;One of the biggest vulnerabilities in LLMs today is prompt injection — where users sneak in inputs that hijack or manipulate the AI’s behavior.&lt;br&gt;
Think of it like someone whispering fake orders to your assistant while you’re not looking 😅&lt;/p&gt;
&lt;h3&gt;
  
  
  🛡️ &lt;strong&gt;The Fix: Prompt Templates&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Prompt templates let you &lt;strong&gt;structure conversations into clear roles&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;System&lt;/strong&gt; – instructions from the developer
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;User&lt;/strong&gt; – the actual user input
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Assistant&lt;/strong&gt; – the AI’s response
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This layered approach (like OpenAI’s &lt;strong&gt;ChatML format&lt;/strong&gt;) tells the model &lt;strong&gt;who is saying what&lt;/strong&gt;, and &lt;strong&gt;where one speaker stops and another begins&lt;/strong&gt;. That boundary is key 🔐&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;This makes it much harder for malicious input to confuse or trick the model.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;
  
  
  🧱 Why This Matters
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Prompt templates &lt;strong&gt;reduce ambiguity&lt;/strong&gt;, helping LLMs interpret input more accurately
&lt;/li&gt;
&lt;li&gt;They &lt;strong&gt;separate trusted developer instructions&lt;/strong&gt; from unpredictable user text
&lt;/li&gt;
&lt;li&gt;Over time, this structure can help &lt;strong&gt;fully prevent injection attacks&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Even when you give a simple instruction to an LLM, behind the scenes, it’s wrapped in a structured template — marking your role, your intent, and your context.&lt;/p&gt;


&lt;h2&gt;
  
  
  📐 Prompt Formats (Styles)
&lt;/h2&gt;

&lt;p&gt;Here are a few popular formats used in different LLMs:&lt;/p&gt;
&lt;h3&gt;
  
  
  🦙 Alpaca Prompt
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;### Instruction:&lt;/span&gt;
Do X

&lt;span class="gu"&gt;### Input:&lt;/span&gt;
With Y

&lt;span class="gu"&gt;### Response:&lt;/span&gt;
Result
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;  Instruction: For the given number by user perform arthematic operation
  Input: what is 2 + 2
  Response:
  ## the LLM will predict the next set of token and return 4.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  🦙 LLaMA-2 Format ( used by &lt;a href="https://www.llama.com/docs/model-cards-and-prompt-formats/meta-llama-2/" rel="noopener noreferrer"&gt;LLaMA-2&lt;/a&gt;) :
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;      &lt;span class="nt"&gt;&amp;lt;s&amp;gt;&lt;/span&gt; 
        [INST] 
          &lt;span class="nt"&gt;&amp;lt;&lt;/span&gt;&lt;span class="err"&gt;&amp;lt;&lt;/span&gt;&lt;span class="na"&gt;SYS&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;&amp;gt;
            {{ system_prompt }}
          &lt;span class="nt"&gt;&amp;lt;&lt;/span&gt;&lt;span class="err"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="na"&gt;SYS&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;&amp;gt;
            {{ user_message_1 }}
        [/INST]
        {{ model_answer_1 }}
      &lt;span class="nt"&gt;&amp;lt;/s&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;s&amp;gt;&lt;/span&gt;
        [INST]
          {{ user_message_2 }} 
        [/INST] 
      &lt;span class="nt"&gt;&amp;lt;/s&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  🦙 LLaMA-3 Format ( used by &lt;a href="https://www.llama.com/docs/model-cards-and-prompt-formats/meta-llama-3/" rel="noopener noreferrer"&gt;LLaMA-3&lt;/a&gt;)
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;      &lt;span class="nt"&gt;&amp;lt;&lt;/span&gt;&lt;span class="err"&gt;|&lt;/span&gt;&lt;span class="na"&gt;begin_of_text&lt;/span&gt;&lt;span class="err"&gt;|&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;&lt;/span&gt;&lt;span class="err"&gt;|&lt;/span&gt;&lt;span class="na"&gt;start_header_id&lt;/span&gt;&lt;span class="err"&gt;|&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
          system
        &lt;span class="nt"&gt;&amp;lt;&lt;/span&gt;&lt;span class="err"&gt;|&lt;/span&gt;&lt;span class="na"&gt;end_header_id&lt;/span&gt;&lt;span class="err"&gt;|&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;&lt;span class="sb"&gt;

      You are a helpful AI assistant for travel tips and recommendations

      &amp;lt;|eot_id|&amp;gt;
        &amp;lt;|start_header_id|&amp;gt;
          user
        &amp;lt;|end_header_id|&amp;gt;

      What can you help me with?

      &amp;lt;|eot_id|&amp;gt;
        &amp;lt;|start_header_id|&amp;gt;
          assistant
        &amp;lt;|end_header_id|&amp;gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  💬 ChatML Format (used by OpenAI)
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"system"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"You are a helpful assistant"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"What is LRU cache?"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"assistant"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"LRU stands for..."&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  🛠️ Prompting Techniques
&lt;/h2&gt;

&lt;p&gt;Let’s explore the &lt;strong&gt;ways&lt;/strong&gt; you can craft prompts:&lt;/p&gt;
&lt;h3&gt;
  
  
  1. 🕵️ Zero-Shot Prompting
&lt;/h3&gt;

&lt;p&gt;Just ask the question without giving any examples.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Write a cold email introducing our new app.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;AI uses its existing knowledge. Good for quick tasks. No examples needed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dotenv&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dotenv&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="nf"&gt;load_dotenv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;api_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;what is 5*45+34%3*2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;   
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AI response -&amp;gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;api_response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  2. ✌️ Few-Shot Prompting
&lt;/h3&gt;

&lt;p&gt;Here, you give a few examples first, then ask for a new answer.&lt;/p&gt;

&lt;p&gt;Helps improve accuracy when the task is nuanced or requires understanding a pattern.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dotenv&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dotenv&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="nf"&gt;load_dotenv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;system_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'''&lt;/span&gt;&lt;span class="s"&gt;
                    You are an AI assistant which helper user in solving mathematical question.
                    Any question other than the mathematical question should not be answered by you.

                    Example:
                    Input: 2+2
                    Output: 2+2 is 4 which is calculate by adding 2 + 2

                    Input: 3*0+5
                    Output: 3*0+5 is 5. As per rule we first multiply and then add. So 3*0 is 0 and 0+5 is 5 which is calculated by first multiplying 3 with 0 and then adding the result with 5

                    Input: why is sky blue?
                    Output: Is this maths query? I am an mathematic assistant and can help you in mathematics only.

                &lt;/span&gt;&lt;span class="sh"&gt;'''&lt;/span&gt;
&lt;span class="n"&gt;api_response_1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;## adjust to control pricing by limiting the token count
&lt;/span&gt;    &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;## adjust temperature to add more creativity/randmoness to output
&lt;/span&gt;    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;what is 5*45+34%3*2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;   
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AI response 1 -&amp;gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;api_response_1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;api_response_2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;what&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s the speed at which cheetah can run&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;   
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AI response 2 -&amp;gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;api_response_2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fazylhb9904cqd27cv4xc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fazylhb9904cqd27cv4xc.png" alt="Api response 1" width="800" height="139"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv9um89gy41rhli4dratc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv9um89gy41rhli4dratc.png" alt="Api response 2" width="800" height="25"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  3. 🔗 Chain-of-Thought (CoT)
&lt;/h3&gt;

&lt;p&gt;Here, we ask the model to &lt;strong&gt;explain step-by-step&lt;/strong&gt; before giving the answer.&lt;br&gt;
The model is encroughed to break down reasoning step by step before arriving at an answer.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Let’s break it down: First we..., then we...”  &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This improves accuracy and makes AI reasoning more transparent 🧠&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dotenv&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dotenv&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="nf"&gt;load_dotenv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;api_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt; &lt;span class="c1"&gt;# Initialize response variable
&lt;/span&gt;
&lt;span class="n"&gt;system_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'''&lt;/span&gt;&lt;span class="s"&gt;
    You are an AI assistant who is expert in breaking down complex problems
    and then resolve the user query.

    For the given user input, analyse the input and break down the problem step by step.
    Atleast think 5-6 steps on how to resolve the problem before solving the problem.

    The steps are you get &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;input&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;, you &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;analyse&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;, you &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;think&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; several times and then return the output with explanation.
    Finally you validate the ouput before giving the final &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;output&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.

    Follow these step in sequence &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;analsyse&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;think&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;output&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;validate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; and finally &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;output&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.

    Rules:
        1. Follow the strict JSON ouput as per Output schema.
        2. Always perform one step at a time and wait for next input
        3. Carefully analyze the user query.

    Output Fromat:
        {{step :&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;output&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,content:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;string&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;}}

    Example:
    Input : what is 2+2?
    Output : {{step:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;analyse&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,content:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;The user is intrresetd in basic maths query and he is asking basic arthematice operation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;}}   
    Output : {{step:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;think&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,content:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;To perform addition one must got from left to rigth and add all the operands&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;}}
    Output : {{step:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;output&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,content:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;}}
    Output : {{step:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;validate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,content:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Seems like 4 is correct as 2+2 adds up to 4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;}}
    Output : {{step:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;output&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,content:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2 + 2 = 4 and that is calculated by adding all numbers.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;}}

&lt;/span&gt;&lt;span class="sh"&gt;'''&lt;/span&gt;

&lt;span class="n"&gt;api_response&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;user_query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;input&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;user_query&lt;/span&gt;&lt;span class="p"&gt;},)&lt;/span&gt;

&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;api_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="n"&gt;response_format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;json_object&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;parsed_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parsed_response&lt;/span&gt;&lt;span class="p"&gt;)})&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;parsed_response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;step&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;output&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;each step -&amp;gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;parsed_response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;continue&lt;/span&gt;

    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Parsed Respose : &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;parsed_response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;break&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F53xezw6sm4qhfl6wv0fl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F53xezw6sm4qhfl6wv0fl.png" alt="AI Response 1" width="800" height="106"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpxh0b4pcrt0yeerf31nj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpxh0b4pcrt0yeerf31nj.png" alt="AI Response 2" width="800" height="131"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  4. 🔁 Self-Consistency Prompting
&lt;/h3&gt;

&lt;p&gt;Run the same prompt multiple times. Pick the &lt;strong&gt;most common or logical&lt;/strong&gt; answer.&lt;/p&gt;

&lt;p&gt;Just like asking 5 friends and trusting the one most agree on!&lt;/p&gt;




&lt;h3&gt;
  
  
  5. 🧑‍🎓 Persona-Based Prompting
&lt;/h3&gt;

&lt;p&gt;Give the AI a &lt;strong&gt;personality or an profession&lt;/strong&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“You are a doctor giving tips to new parents.”&lt;br&gt;&lt;br&gt;
It shapes how the AI responds!&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h3&gt;
  
  
  6. 🎭 Role-Playing Prompt
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;“You are an expert coding tutor for beginners.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Let the AI act in character 🎬 and adapt to the role you've assigned.&lt;/p&gt;




&lt;p&gt;As we go deeper, we’ll explore advanced prompt engineering strategies like:&lt;/p&gt;

&lt;p&gt;1.🔍 Contextual Prompting&lt;br&gt;&lt;br&gt;
  2.🖼️ Multimodal Prompting&lt;/p&gt;

&lt;p&gt;These techniques go beyond just writing smart instructions — they need an &lt;strong&gt;orchestrator&lt;/strong&gt; behind the scenes.&lt;br&gt;&lt;br&gt;
Think of the orchestrator as a &lt;strong&gt;conductor&lt;/strong&gt;, managing how data flows into and out of the LLM for maximum accuracy and relevance.&lt;br&gt;
These technique are used in apps that require deep context like chatbots, search assistants, etc.&lt;/p&gt;

&lt;p&gt;To make this work, we’ll integrate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Vector Databases&lt;/strong&gt; – to provide semantic context and memory
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Graph Databases&lt;/strong&gt; – to model relationships between entities
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PostgreSQL&lt;/strong&gt; – for handling structured data
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool / Function Calling&lt;/strong&gt; – so the model can dynamically execute actions in real-time&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We’ll learn how to stitch these together to build powerful, context-aware, multi-modal AI systems in upcoming blog.&lt;/p&gt;




&lt;h2&gt;
  
  
  🧪 How to Choose the Right Prompt Technique?
&lt;/h2&gt;

&lt;p&gt;Here’s the secret sauce:&lt;/p&gt;

&lt;p&gt;👉 &lt;strong&gt;Experiment. Track. Improve.&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Observe how your app responds to real user queries
&lt;/li&gt;
&lt;li&gt;Mix and match techniques like:

&lt;ul&gt;
&lt;li&gt;CoT + Role-Play + Persona 🤯
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Use observability tools to capture and analyze bad vs. good outputs to tweak your prompt technique accordingly.&lt;/li&gt;

&lt;li&gt;Keep refining over time&lt;/li&gt;

&lt;/ul&gt;




&lt;h2&gt;
  
  
  🎯 Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Prompt Engineering isn’t just about giving commands to AI.&lt;br&gt;&lt;br&gt;
It’s about speaking its language clearly and cleverly 💡&lt;/p&gt;

&lt;p&gt;If you're building AI tools, learning how to write great prompts will make your results 10x better.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Great prompts = Great products 🚀&lt;/p&gt;
&lt;/blockquote&gt;




</description>
      <category>promptengineering</category>
      <category>ai</category>
      <category>chaicode</category>
    </item>
    <item>
      <title>Generative AI Jargons You Should Know</title>
      <dc:creator>Rahul @codingkite</dc:creator>
      <pubDate>Sun, 13 Apr 2025 10:29:22 +0000</pubDate>
      <link>https://forem.com/codingkite/generative-ai-jargons-you-should-know-1c2d</link>
      <guid>https://forem.com/codingkite/generative-ai-jargons-you-should-know-1c2d</guid>
      <description>&lt;h1&gt;
  
  
  🤖 Ever Wondered How ChatGPT or Gemini Works?
&lt;/h1&gt;

&lt;p&gt;We all have used AI tools like ChatGPT or Gemini, but have you ever wondered how these tools are able to generate such accurate responses to our queries? 🤔&lt;/p&gt;

&lt;p&gt;In this blog, we’ll get an &lt;strong&gt;overview of how AI models generate responses&lt;/strong&gt; — and along the way, we’ll learn some &lt;strong&gt;jargon&lt;/strong&gt; 🧠 that you often see floating around the internet 🌐.&lt;/p&gt;




&lt;h2&gt;
  
  
  🧩 AI Jargon You’ll Learn in This Blog
&lt;/h2&gt;

&lt;p&gt;Let’s walk through the working of an AI model while exploring these terms:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;🔤 LLM
&lt;/li&gt;
&lt;li&gt;🧠 GPT (Generative Pre-trained Transformer)
&lt;/li&gt;
&lt;li&gt;⚙️ Transformer
&lt;/li&gt;
&lt;li&gt;🧱 Tokens
&lt;/li&gt;
&lt;li&gt;📥 Encoder / Encoding
&lt;/li&gt;
&lt;li&gt;📍 Positional Encoding
&lt;/li&gt;
&lt;li&gt;📤 Decoder / Decoding
&lt;/li&gt;
&lt;li&gt;🧮 Vectors
&lt;/li&gt;
&lt;li&gt;🔗 Embedding
&lt;/li&gt;
&lt;li&gt;🧠 Semantic Meaning
&lt;/li&gt;
&lt;li&gt;👁️ Self Attention
&lt;/li&gt;
&lt;li&gt;🎯 SoftMax
&lt;/li&gt;
&lt;li&gt;🧠 Multi-Head Attention
&lt;/li&gt;
&lt;li&gt;🌡️ Temperature
&lt;/li&gt;
&lt;li&gt;📅 Knowledge Cutoff
&lt;/li&gt;
&lt;li&gt;✂️ Tokenization
&lt;/li&gt;
&lt;li&gt;📚 Vocab Size
&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  🤔 What is AI?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;AI is basically an &lt;strong&gt;algorithm trained on data&lt;/strong&gt;. After training, it generates output based on &lt;strong&gt;learned weights&lt;/strong&gt; in response to a user query.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  ⚙️ Transformer
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;An &lt;strong&gt;AI model&lt;/strong&gt; is a mathematical structure that has learned patterns from data.&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;Transformer&lt;/strong&gt; is a type of deep learning architecture that’s especially good at understanding sequences like &lt;strong&gt;text&lt;/strong&gt; or &lt;strong&gt;audio&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;It was introduced by Google in 2017 in a paper called:
📄 &lt;strong&gt;&lt;a href="https://arxiv.org/pdf/1706.03762" rel="noopener noreferrer"&gt;Attention is All You Need&lt;/a&gt;&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🤖 GPT
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GPT&lt;/strong&gt; stands for &lt;strong&gt;Generative Pre-trained Transformer&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;It’s a &lt;strong&gt;pre-trained transformer&lt;/strong&gt; that &lt;strong&gt;generates the next token&lt;/strong&gt; based on the data it has seen.&lt;/li&gt;
&lt;li&gt;🏋️ Training GPT is expensive and time-consuming, so it’s not retrained frequently.&lt;/li&gt;
&lt;li&gt;Thus, GPT has a &lt;strong&gt;knowledge cutoff&lt;/strong&gt; — it doesn’t know anything that happened after its last training.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;🔍 &lt;strong&gt;Fun Fact:&lt;/strong&gt; ChatGPT combines GPT with an &lt;strong&gt;agent&lt;/strong&gt; system in the background.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  🧱 Transformer Model Architecture
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkm05h87v9qpwxcio86fh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkm05h87v9qpwxcio86fh.png" alt="Transformer Model Architecture Diagram" width="782" height="1152"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At first glance, this architecture might seem scary 😨, but it becomes simple when explained properly. Let’s break it down:&lt;/p&gt;




&lt;h3&gt;
  
  
  🧾 Input Query &amp;amp; Tokenization
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Input&lt;/strong&gt;: The query provided by the user.&lt;/li&gt;
&lt;li&gt;AI models like LLMs don’t understand human languages directly — they understand &lt;strong&gt;numbers&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;So, we convert the input into numbers — this process is called &lt;strong&gt;Tokenization&lt;/strong&gt;.

&lt;ul&gt;
&lt;li&gt;A &lt;strong&gt;token&lt;/strong&gt; is a word or piece of a word.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;token ID&lt;/strong&gt; is the numeric representation.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Libraries like &lt;strong&gt;tiktoken&lt;/strong&gt; (used by OpenAI) perform tokenization.&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;🔤 &lt;strong&gt;Vocabulary Size&lt;/strong&gt; = Total number of unique tokens in the tokenizer's dictionary.&lt;/p&gt;




&lt;h3&gt;
  
  
  🧰 Tokenizer
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;A tokenizer is &lt;strong&gt;separate&lt;/strong&gt; from the model.&lt;/li&gt;
&lt;li&gt;It has a &lt;strong&gt;fixed vocabulary&lt;/strong&gt;: a mapping of words (tokens) to numbers (token IDs).&lt;/li&gt;
&lt;li&gt;The model doesn’t “know” words — it only understands token IDs.&lt;/li&gt;
&lt;li&gt;During inference:

&lt;ul&gt;
&lt;li&gt;If a &lt;strong&gt;new word&lt;/strong&gt; appears, the tokenizer:&lt;/li&gt;
&lt;li&gt;Breaks it into &lt;strong&gt;known sub-tokens&lt;/strong&gt;, or
&lt;/li&gt;
&lt;li&gt;Uses an &lt;strong&gt;unknown token&lt;/strong&gt; placeholder.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;So, &lt;strong&gt;vocab size&lt;/strong&gt; = number of &lt;strong&gt;unique tokens&lt;/strong&gt; the model can recognize.&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;👉 Tokenizer visualizer: &lt;a href="https://tiktokenizer.vercel.app/" rel="noopener noreferrer"&gt;tiktokenizer.vercel.app&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  🧪 Tokenizer Code Example
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;tiktoken._educational&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tiktoken&lt;/span&gt;

&lt;span class="n"&gt;encoder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tiktoken&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_encoding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;o200k_base&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;encoder&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;encoder&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vocab size&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;encoder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;n_vocab&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;encoder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Hello World, How are you?&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;decoded_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;encoder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;decoded_text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;decoded_text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tiktoken&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encoding_for_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;text-embedding-3-small&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;encoder_for_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tiktoken&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encoding_for_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;encoder_for_model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;encoder_for_model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Train a BPE tokeniser on a small amount of text
&lt;/span&gt;&lt;span class="n"&gt;enc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;train_simple_encoding&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;enc&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;enc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Visualise how the GPT-4 encoder encodes text
&lt;/span&gt;&lt;span class="n"&gt;another_encoder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SimpleBytePairEncoding&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_tiktoken&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cl100k_base&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;another_encoder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hello world aaaaaaaaaaaa&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🧠 Embeddings (Vector &amp;amp; Positional)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  🧮 Vector Embedding
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;strong&gt;vector embedding&lt;/strong&gt; is the &lt;strong&gt;numerical representation of tokens&lt;/strong&gt; that captures &lt;strong&gt;semantic meaning&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Semantic meaning = meaning of the word in a &lt;strong&gt;specific context&lt;/strong&gt;.

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;"Reserve Bank"&lt;/em&gt; vs &lt;em&gt;"Bank of a river"&lt;/em&gt; — same word, different meanings.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;✨ Imagine "cat" as a point in space: &lt;code&gt;[0.2, 1.3, -0.5, ...]&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ul&gt;
&lt;li&gt;These embeddings can be stored in vector databases like:

&lt;ul&gt;
&lt;li&gt;🔍 &lt;strong&gt;Pinecone&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;🌌 &lt;strong&gt;Chroma DB&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;🧭 &lt;strong&gt;Qdrant&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;🧭 Visualize embeddings here: &lt;a href="https://projector.tensorflow.org/" rel="noopener noreferrer"&gt;TensorFlow Projector&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  🧠 Semantic Example
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;King ➡️ Queen&lt;/em&gt; implies &lt;em&gt;Man ➡️ ?&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;If "Queen" is 3 units down and "Man" is 4 units left of "King" in vector space — then the model can estimate the missing word using vector math ✨.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4hs92q3hjurjby6p9q4x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4hs92q3hjurjby6p9q4x.png" alt="Embedding Visualization" width="800" height="507"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  📍 Positional Embedding
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Tokens alone &lt;strong&gt;don’t carry position info&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Two sentences with the &lt;strong&gt;same words but different order&lt;/strong&gt; mean very different things!

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;"The cat sat on the mat"&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;&lt;em&gt;"The mat sat on the cat"&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;So, we use &lt;strong&gt;positional embeddings&lt;/strong&gt; to give the model &lt;strong&gt;context of order&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;🛠️ These modify the original embedding to reflect &lt;strong&gt;word position&lt;/strong&gt; in the sentence.&lt;/p&gt;




&lt;h3&gt;
  
  
  📥 Embedding Example with OpenAI
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dotenv&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dotenv&lt;/span&gt;

&lt;span class="nf"&gt;load_dotenv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Eiffel tower is in Paris and is a famous landmark, it is 324 meters tall&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text-embedding-ada-002&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vector embeddings&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  🔄 &lt;strong&gt;Self Attention Mechanism&lt;/strong&gt;
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;In simple terms: &lt;strong&gt;Tokens can talk to each other and update themselves!&lt;/strong&gt; 🧠💬&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ul&gt;
&lt;li&gt;This means when two tokens interact, they can update their &lt;strong&gt;vector embeddings&lt;/strong&gt; based on the sentence &lt;strong&gt;and&lt;/strong&gt; with respect to each other.&lt;/li&gt;
&lt;li&gt;Example: &lt;code&gt;"river bank"&lt;/code&gt;

&lt;ul&gt;
&lt;li&gt;When these 2 tokens talk to each other, they update their embeddings based on the context.&lt;/li&gt;
&lt;li&gt;So even if the word &lt;code&gt;"bank"&lt;/code&gt; appears in both &lt;code&gt;"river bank"&lt;/code&gt; and &lt;code&gt;"icici bank"&lt;/code&gt;, and both have:&lt;/li&gt;
&lt;li&gt;the &lt;strong&gt;same token&lt;/strong&gt;,
&lt;/li&gt;
&lt;li&gt;the &lt;strong&gt;same original vector embedding&lt;/strong&gt;,
&lt;/li&gt;
&lt;li&gt;and the &lt;strong&gt;same positional embedding&lt;/strong&gt;,
&lt;/li&gt;
&lt;li&gt;the &lt;strong&gt;final vector embedding will differ&lt;/strong&gt; because of how they interact with other tokens in the sentence.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Tokens update their embeddings based on &lt;strong&gt;all tokens&lt;/strong&gt; in the sentence — not just one or two.&lt;/li&gt;

&lt;li&gt;So what does self-attention do?

&lt;ul&gt;
&lt;li&gt;It allows &lt;strong&gt;tokens to adjust their embeddings&lt;/strong&gt; based on the other tokens present in the input. 🔁&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;




&lt;h2&gt;
  
  
  🧠 &lt;strong&gt;Multi-Head Attention — Seeing Things in Many Ways&lt;/strong&gt;
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;Think of it like observing the sentence from &lt;strong&gt;multiple perspectives at once!&lt;/strong&gt; 👀🔍&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ul&gt;
&lt;li&gt;It helps the model focus on different &lt;strong&gt;aspects/perspectives&lt;/strong&gt; of the tokens &lt;strong&gt;simultaneously&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;At its heart, attention is about &lt;strong&gt;weighing the importance&lt;/strong&gt; of different input tokens when focusing on a specific one.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-head attention&lt;/strong&gt; runs &lt;strong&gt;multiple attention operations in parallel&lt;/strong&gt; (called “heads”) and &lt;strong&gt;combines their results&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;This allows the model to learn &lt;strong&gt;various types of relationships&lt;/strong&gt; between tokens — all at the same time. 💡🧩&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🔗 &lt;strong&gt;Feed Forward&lt;/strong&gt;
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;It’s a neural network that processes each token &lt;strong&gt;individually&lt;/strong&gt; after attention is done. 🛠️&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ul&gt;
&lt;li&gt;The interaction cycle between &lt;strong&gt;Multi-Head Attention&lt;/strong&gt; and &lt;strong&gt;Feed Forward&lt;/strong&gt; is repeated many times to get a rich contextual result.&lt;/li&gt;
&lt;li&gt;In GPT (and Transformers in general), after using &lt;strong&gt;attention&lt;/strong&gt; to understand word relationships, the model sends the result through a &lt;strong&gt;Feed Forward Neural Network (FFN)&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;🧠 "Okay, I understand which words matter (thanks to attention)...&lt;br&gt;&lt;br&gt;
Now let me do some &lt;strong&gt;math&lt;/strong&gt; on each word to extract more meaning!" ➗🔍&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h3&gt;
  
  
  ⚙️ &lt;strong&gt;How It Works&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;Feed Forward block&lt;/strong&gt; is just a small, regular neural network applied &lt;strong&gt;to each token individually&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Here's what happens:
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;Each token (word embedding) goes into a &lt;strong&gt;Linear layer&lt;/strong&gt; (fully connected).&lt;/li&gt;
&lt;li&gt;Then through a &lt;strong&gt;ReLU&lt;/strong&gt; or &lt;strong&gt;GELU&lt;/strong&gt; activation (for non-linearity).&lt;/li&gt;
&lt;li&gt;Then again through &lt;strong&gt;another Linear layer&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;The output replaces the old one and continues through the Transformer stack.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;🧱 Formula:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FFN(token) = Linear -&amp;gt; Activation -&amp;gt; Linear
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  🧠 &lt;strong&gt;Why Is It Important?&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Attention&lt;/strong&gt; handles relationships &lt;strong&gt;between words&lt;/strong&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Feed Forward&lt;/strong&gt; handles &lt;strong&gt;processing each word by itself&lt;/strong&gt;, like extracting deeper meanings and features. 🌱&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  🔁 &lt;strong&gt;Simple Analogy&lt;/strong&gt;
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Attention&lt;/strong&gt;: "Hey 'cat', pay attention to 'mat' and 'sat'!" 🐱🧘🪑&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Feed Forward&lt;/strong&gt;: "Cool. Now let me upgrade 'cat' with better features based on that context." 🚀📈&lt;/p&gt;
&lt;/blockquote&gt;







&lt;h2&gt;
  
  
  🔄 &lt;strong&gt;Self Attention Mechanism&lt;/strong&gt;
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;In simple terms: &lt;strong&gt;Tokens can talk to each other and update themselves!&lt;/strong&gt; 🧠💬&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ul&gt;
&lt;li&gt;This means when two tokens interact, they can update their &lt;strong&gt;vector embeddings&lt;/strong&gt; based on the sentence &lt;strong&gt;and&lt;/strong&gt; with respect to each other.&lt;/li&gt;
&lt;li&gt;Example: &lt;code&gt;"river bank"&lt;/code&gt;

&lt;ul&gt;
&lt;li&gt;When these 2 tokens talk to each other, they update their embeddings based on the context.&lt;/li&gt;
&lt;li&gt;So even if the word &lt;code&gt;"bank"&lt;/code&gt; appears in both &lt;code&gt;"river bank"&lt;/code&gt; and &lt;code&gt;"icici bank"&lt;/code&gt;, and both have:&lt;/li&gt;
&lt;li&gt;the &lt;strong&gt;same token&lt;/strong&gt;,
&lt;/li&gt;
&lt;li&gt;the &lt;strong&gt;same original vector embedding&lt;/strong&gt;,
&lt;/li&gt;
&lt;li&gt;and the &lt;strong&gt;same positional embedding&lt;/strong&gt;,
&lt;/li&gt;
&lt;li&gt;the &lt;strong&gt;final vector embedding will differ&lt;/strong&gt; because of how they interact with other tokens in the sentence.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Tokens update their embeddings based on &lt;strong&gt;all tokens&lt;/strong&gt; in the sentence — not just one or two.&lt;/li&gt;

&lt;li&gt;So what does self-attention do?

&lt;ul&gt;
&lt;li&gt;It allows &lt;strong&gt;tokens to adjust their embeddings&lt;/strong&gt; based on the other tokens present in the input. 🔁&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;




&lt;h2&gt;
  
  
  🧠 &lt;strong&gt;Multi-Head Attention — Seeing Things in Many Ways&lt;/strong&gt;
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;Think of it like observing the sentence from &lt;strong&gt;multiple perspectives at once!&lt;/strong&gt; 👀🔍&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ul&gt;
&lt;li&gt;It helps the model focus on different &lt;strong&gt;aspects/perspectives&lt;/strong&gt; of the tokens &lt;strong&gt;simultaneously&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;At its heart, attention is about &lt;strong&gt;weighing the importance&lt;/strong&gt; of different input tokens when focusing on a specific one.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-head attention&lt;/strong&gt; runs &lt;strong&gt;multiple attention operations in parallel&lt;/strong&gt; (called “heads”) and &lt;strong&gt;combines their results&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;This allows the model to learn &lt;strong&gt;various types of relationships&lt;/strong&gt; between tokens — all at the same time. 💡🧩&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🔗 &lt;strong&gt;Feed Forward&lt;/strong&gt;
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;It’s a neural network that processes each token &lt;strong&gt;individually&lt;/strong&gt; after attention is done. 🛠️&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ul&gt;
&lt;li&gt;The interaction cycle between &lt;strong&gt;Multi-Head Attention&lt;/strong&gt; and &lt;strong&gt;Feed Forward&lt;/strong&gt; is repeated many times to get a rich contextual result.&lt;/li&gt;
&lt;li&gt;In GPT (and Transformers in general), after using &lt;strong&gt;attention&lt;/strong&gt; to understand word relationships, the model sends the result through a &lt;strong&gt;Feed Forward Neural Network (FFN)&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;🧠 "Okay, I understand which words matter (thanks to attention)...&lt;br&gt;&lt;br&gt;
Now let me do some &lt;strong&gt;math&lt;/strong&gt; on each word to extract more meaning!" ➗🔍&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h3&gt;
  
  
  ⚙️ &lt;strong&gt;How It Works&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;Feed Forward block&lt;/strong&gt; is just a small, regular neural network applied &lt;strong&gt;to each token individually&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Here's what happens:
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;Each token (word embedding) goes into a &lt;strong&gt;Linear layer&lt;/strong&gt; (fully connected).&lt;/li&gt;
&lt;li&gt;Then through a &lt;strong&gt;ReLU&lt;/strong&gt; or &lt;strong&gt;GELU&lt;/strong&gt; activation (for non-linearity).&lt;/li&gt;
&lt;li&gt;Then again through &lt;strong&gt;another Linear layer&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;The output replaces the old one and continues through the Transformer stack.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;🧱 Formula:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FFN(token) = Linear -&amp;gt; Activation -&amp;gt; Linear
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  🧠 &lt;strong&gt;Why Is It Important?&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Attention&lt;/strong&gt; handles relationships &lt;strong&gt;between words&lt;/strong&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Feed Forward&lt;/strong&gt; handles &lt;strong&gt;processing each word by itself&lt;/strong&gt;, like extracting deeper meanings and features. 🌱&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  🔁 &lt;strong&gt;Simple Analogy&lt;/strong&gt;
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Attention&lt;/strong&gt;: "Hey 'cat', pay attention to 'mat' and 'sat'!" 🐱🧘🪑&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Feed Forward&lt;/strong&gt;: "Cool. Now let me upgrade 'cat' with better features based on that context." 🚀📈&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  🚀 &lt;strong&gt;Two Phases of a Model&lt;/strong&gt;
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;🔧 &lt;strong&gt;Training Phase&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;🧠 &lt;strong&gt;Inference Phase&lt;/strong&gt; (using phase)&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  🔧 &lt;strong&gt;Training Phase&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Let’s break it down:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;In the training phase, we match &lt;strong&gt;input&lt;/strong&gt; and &lt;strong&gt;output&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;We provide the &lt;strong&gt;input&lt;/strong&gt;, the model gives us an &lt;strong&gt;output&lt;/strong&gt;, we compare it with the &lt;strong&gt;actual output&lt;/strong&gt;, calculate the &lt;strong&gt;loss&lt;/strong&gt;, and then &lt;strong&gt;backpropagate&lt;/strong&gt; it (&lt;em&gt;backpropagation = वapas jao 😄&lt;/em&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Example Flow:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Input&lt;/strong&gt;: &lt;code&gt;&amp;lt;start&amp;gt; my name is piyush&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Actual Output&lt;/strong&gt;: &lt;code&gt;&amp;lt;start&amp;gt; my name is piyush &amp;lt;end&amp;gt; I am good&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Model Output&lt;/strong&gt;: &lt;code&gt;&amp;lt;start&amp;gt; my name is piyush xsfd@e&lt;/code&gt; 😅&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We calculate the &lt;strong&gt;loss&lt;/strong&gt; between the model’s output and the actual output.&lt;/li&gt;
&lt;li&gt;We send it back through the model to adjust the weights.&lt;/li&gt;
&lt;li&gt;We repeat this until the model starts giving the expected output. 🔁&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;✅ &lt;strong&gt;Goal&lt;/strong&gt;: Allow the model to update its &lt;strong&gt;weights&lt;/strong&gt; using the training data.&lt;br&gt;&lt;br&gt;
⚙️ This backpropagation and weight update process requires &lt;strong&gt;a lot of compute power&lt;/strong&gt;, hence &lt;strong&gt;heavy GPU usage&lt;/strong&gt; 💻🔥.&lt;/p&gt;




&lt;h3&gt;
  
  
  🧠 &lt;strong&gt;Inference Phase&lt;/strong&gt;: &lt;em&gt;Using the model&lt;/em&gt;
&lt;/h3&gt;

&lt;p&gt;Time to use what we've trained! 😎&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We provide an input to the model:

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;token&lt;/code&gt; → &lt;code&gt;vector embedding&lt;/code&gt; ➕ &lt;code&gt;positional embedding&lt;/code&gt; → 🎯 &lt;code&gt;multi-head attention&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;⚡ Technically, the model generates &lt;strong&gt;multiple possible outputs&lt;/strong&gt; for a given input.&lt;/p&gt;




&lt;h4&gt;
  
  
  🧪 Example:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Input&lt;/strong&gt;: &lt;code&gt;&amp;lt;start&amp;gt; how are you?&amp;lt;end&amp;gt;&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Raw Outputs&lt;/strong&gt;: I, S, U&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;linear&lt;/code&gt; step calculates &lt;strong&gt;probabilities&lt;/strong&gt; for each token.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Example Output (with probabilities)&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;I (98%) ✅&lt;/li&gt;
&lt;li&gt;S (5%)&lt;/li&gt;
&lt;li&gt;U (0.3%)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;The &lt;code&gt;Softmax&lt;/code&gt; step picks the one with the &lt;strong&gt;highest probability&lt;/strong&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;🎛️ &lt;strong&gt;Temperature Parameter&lt;/strong&gt;:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Used by &lt;code&gt;Softmax&lt;/code&gt; to control randomness.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Higher temperature&lt;/strong&gt; → more randomness (might pick a less probable token).
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lower temperature&lt;/strong&gt; → more deterministic (sticks to highest probability).&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;




&lt;h4&gt;
  
  
  🪜 Steps Breakdown
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;STEP 1&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Input&lt;/em&gt;: &lt;code&gt;&amp;lt;start&amp;gt; how are you?&amp;lt;end&amp;gt;&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Probabilities&lt;/em&gt;: I (98%), S (5%), U (0.3%)&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Chosen&lt;/em&gt;: I&lt;/li&gt;
&lt;li&gt;✅ &lt;em&gt;Final Output&lt;/em&gt;: &lt;code&gt;&amp;lt;start&amp;gt; how are you?&amp;lt;end&amp;gt; I&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;STEP 2&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Input&lt;/em&gt;: &lt;code&gt;&amp;lt;start&amp;gt; how are you?&amp;lt;end&amp;gt; I&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Probabilities&lt;/em&gt;: _am (80%), few (5%), good (4%)&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Chosen&lt;/em&gt;: _am&lt;/li&gt;
&lt;li&gt;✅ &lt;em&gt;Final Output&lt;/em&gt;: &lt;code&gt;&amp;lt;start&amp;gt; how are you?&amp;lt;end&amp;gt; I_am&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;STEP 3&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The process &lt;strong&gt;continues iteratively&lt;/strong&gt; 🔁
(until it reaches the &lt;code&gt;&amp;lt;end&amp;gt;&lt;/code&gt; token or finishes generating).&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;




&lt;h3&gt;
  
  
  📚 &lt;strong&gt;Extra Info&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;📜 The &lt;strong&gt;Transformer model&lt;/strong&gt; was introduced by Google in the research paper &lt;strong&gt;"Attention is All You Need"&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;🗣️ Originally created for &lt;strong&gt;Google Translate&lt;/strong&gt;, the idea was to understand and translate &lt;strong&gt;semantic meaning&lt;/strong&gt; using NLP.&lt;/li&gt;
&lt;li&gt;🧠 This means it was built with &lt;strong&gt;language understanding&lt;/strong&gt; in mind.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;🔍 &lt;strong&gt;GPT (by OpenAI)&lt;/strong&gt; was based on the transformer —&lt;br&gt;&lt;br&gt;
But instead of translation, it was built for &lt;strong&gt;next token prediction&lt;/strong&gt; 🧩&lt;/p&gt;
&lt;/blockquote&gt;




</description>
      <category>genai</category>
      <category>ai</category>
      <category>chaicode</category>
    </item>
  </channel>
</rss>
