<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Unknownerror-404</title>
    <description>The latest articles on Forem by Unknownerror-404 (@aniket_kuyate_15acc4e6587).</description>
    <link>https://forem.com/aniket_kuyate_15acc4e6587</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3367313%2Faed567dd-b818-4149-b8b8-2dd00483a822.png</url>
      <title>Forem: Unknownerror-404</title>
      <link>https://forem.com/aniket_kuyate_15acc4e6587</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/aniket_kuyate_15acc4e6587"/>
    <language>en</language>
    <item>
      <title>An update....</title>
      <dc:creator>Unknownerror-404</dc:creator>
      <pubDate>Fri, 13 Mar 2026 17:21:18 +0000</pubDate>
      <link>https://forem.com/aniket_kuyate_15acc4e6587/an-update-3d37</link>
      <guid>https://forem.com/aniket_kuyate_15acc4e6587/an-update-3d37</guid>
      <description>&lt;p&gt;Hey there! Been a while. If you're new here, I'm your resident chatbot geek and classifier nerd. The reason for this blog: an update. Over the past couple of months, I've been working on a few projects (~3), which have consistently taken up most of my free time. So, this is just an update to let you know I have begun working on the next blog, and this time it isn't purely programmatic. It's slightly philosophical. With that, I'll leave you to it.&lt;br&gt;
Until Next Time!&lt;/p&gt;

</description>
      <category>devjournal</category>
    </item>
    <item>
      <title>From Understanding to Action: Teaching Your Assistant to Respond</title>
      <dc:creator>Unknownerror-404</dc:creator>
      <pubDate>Fri, 20 Feb 2026 13:30:00 +0000</pubDate>
      <link>https://forem.com/aniket_kuyate_15acc4e6587/from-understanding-to-action-teaching-your-assistant-to-respond-4nfo</link>
      <guid>https://forem.com/aniket_kuyate_15acc4e6587/from-understanding-to-action-teaching-your-assistant-to-respond-4nfo</guid>
      <description>&lt;p&gt;NLU answers:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“What did the user mean?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Dialogue management answers:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“What should I do about it?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In Rasa, this logic is built using four core components:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Domain files&lt;/li&gt;
&lt;li&gt;Stories&lt;/li&gt;
&lt;li&gt;Rules&lt;/li&gt;
&lt;li&gt;Slots&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let’s break them down.&lt;/p&gt;

&lt;h4&gt;
  
  
  The Domain File: Defining the Assistant’s World
&lt;/h4&gt;

&lt;p&gt;The domain file lives here:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;domain.yml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This file defines everything your assistant knows how to do.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;version: "3.1"

intents:
  - greet
  - book_flight

entities:
  - location

slots:
  location:
    type: text
    mappings:
      - type: from_entity
        entity: location

responses:
  utter_greet:
    - text: "Hello! How can I help you?"

  utter_ask_location:
    - text: "Where would you like to travel?"

actions:
  - action_book_flight

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The domain defines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What intents exist&lt;/li&gt;
&lt;li&gt;What entities exist&lt;/li&gt;
&lt;li&gt;What slots store&lt;/li&gt;
&lt;li&gt;What responses are available&lt;/li&gt;
&lt;li&gt;What custom actions can run&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Think of it as the assistant’s capability registry; if it’s not in the domain, it doesn’t exist.&lt;/p&gt;

&lt;h4&gt;
  
  
  Slots: Memory Between Turns
&lt;/h4&gt;

&lt;p&gt;NLU understands a single message. Slots allow your assistant to remember information across messages. Slots are your assistant’s working memory; without them, every message is isolated.&lt;/p&gt;
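To make this concrete, here is a minimal sketch of how a filled slot can steer the conversation. The slot_was_set step is standard Rasa story syntax (stories are covered just below); the intent, slot, and action names reuse the flight example from the domain above:

```yaml
stories:
  - story: location already provided
    steps:
      - intent: book_flight
      - slot_was_set:
          - location: Madrid
      - action: action_book_flight
```

Because the location slot is already filled, the assistant can skip asking for a destination and proceed straight to booking.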

&lt;h4&gt;
  
  
  Stories: Teaching Multi-Turn Behaviour
&lt;/h4&gt;

&lt;p&gt;Stories describe example conversations and live in stories.yml.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;stories:
  - story: book flight happy path
    steps:
      - intent: book_flight
      - action: utter_ask_location
      - intent: inform
        entities:
          - location: Madrid
      - action: action_book_flight

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Stories show the dialogue model:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Which intent starts a flow&lt;/li&gt;
&lt;li&gt;What action follows&lt;/li&gt;
&lt;li&gt;How slot filling changes behaviour&lt;/li&gt;
&lt;li&gt;When to execute business logic&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Under the hood, Rasa uses a transformer-based dialogue policy (the TED policy) to learn patterns across these conversation examples. Unlike rule-based systems, it generalises beyond exact story matches.&lt;/p&gt;

&lt;h4&gt;
  
  
  Rules: Deterministic Behaviour
&lt;/h4&gt;

&lt;p&gt;Sometimes you don’t want learning; you want certainty. Rules live in rules.yml.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Examples:
rules:
  - rule: respond to greeting
    steps:
      - intent: greet
      - action: utter_greet

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Rules are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Deterministic&lt;/li&gt;
&lt;li&gt;One-intent → one-action&lt;/li&gt;
&lt;li&gt;Ideal for FAQs, greetings, confirmations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Use rules for fixed, predictable behaviour; use stories for multi-turn flows.&lt;/p&gt;

&lt;h4&gt;
  
  
  Common Dialogue Design Mistakes
&lt;/h4&gt;

&lt;p&gt;Just like NLU, dialogue design has pitfalls.&lt;br&gt;
Avoid:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Overusing rules for complex flows&lt;/li&gt;
&lt;li&gt;Writing too few stories&lt;/li&gt;
&lt;li&gt;Ignoring unhappy paths&lt;/li&gt;
&lt;li&gt;Forgetting slot resets&lt;/li&gt;
&lt;li&gt;Embedding business logic in responses&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And most importantly:&lt;br&gt;
Do not confuse intent prediction with behaviour control.&lt;/p&gt;

&lt;p&gt;Intent prediction tells you what the user wants, whereas dialogue management determines what happens next.&lt;/p&gt;

&lt;h4&gt;
  
  
  What You’ve Built So Far
&lt;/h4&gt;

&lt;p&gt;Hopefully by now, you understand:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pipelines&lt;/li&gt;
&lt;li&gt;Rules&lt;/li&gt;
&lt;li&gt;Training&lt;/li&gt;
&lt;li&gt;Dialogue&lt;/li&gt;
&lt;li&gt;Model behaviour&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;…and the basics of Rasa.&lt;/p&gt;

&lt;p&gt;The next post will likely be the last in this series: a complete script that ties all of these concepts together, so you can understand them practically through examples.&lt;br&gt;
Until next time....&lt;/p&gt;

</description>
      <category>ai</category>
      <category>yaml</category>
      <category>chatbot</category>
      <category>rasa</category>
    </item>
    <item>
      <title>From DIET to Deployment: Training Your First Rasa NLU Model</title>
      <dc:creator>Unknownerror-404</dc:creator>
      <pubDate>Mon, 16 Feb 2026 13:30:00 +0000</pubDate>
      <link>https://forem.com/aniket_kuyate_15acc4e6587/from-diet-to-deployment-training-your-first-rasa-nlu-model-3nod</link>
      <guid>https://forem.com/aniket_kuyate_15acc4e6587/from-diet-to-deployment-training-your-first-rasa-nlu-model-3nod</guid>
      <description>&lt;p&gt;CRF showed us structured entity extraction. DIET showed us joint intent–entity learning. Now it’s time to move from theory to practice.&lt;/p&gt;

&lt;p&gt;Understanding models is important. But models are useless without data, and this is where real NLU development begins.&lt;/p&gt;

&lt;p&gt;So far, we’ve discussed how DIET works internally.&lt;br&gt;
Now we answer the practical question: How do we actually train it?&lt;/p&gt;

&lt;p&gt;Rasa training consists of three core steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create structured training data&lt;/li&gt;
&lt;li&gt;Configure the NLU pipeline&lt;/li&gt;
&lt;li&gt;Train the model&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Let’s walk through each.&lt;/p&gt;
&lt;h3&gt;
  
  
  Generating Training Data
&lt;/h3&gt;

&lt;p&gt;Rasa models learn entirely from annotated examples. Unlike rule-based systems, you don’t write logic; you provide examples.&lt;/p&gt;
&lt;h4&gt;
  
  
  The NLU File
&lt;/h4&gt;

&lt;p&gt;Rasa training data lives inside a YAML file, typically:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;data/nlu.yml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;version: "3.1"

nlu:
  - intent: book_flight
    examples: |
      - Book a flight to [Paris](location)
      - I want to fly to [Berlin](location)
      - Get me a ticket to [London](location)

  - intent: greet
    examples: |
      - Hello
      - Hi
      - Hey there
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Intents are labels&lt;/li&gt;
&lt;li&gt;Entities are annotated inline&lt;/li&gt;
&lt;li&gt;No separate entity file&lt;/li&gt;
&lt;li&gt;No feature engineering&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;DIET learns both tasks from this single dataset.&lt;/p&gt;

&lt;h4&gt;
  
  
  How Much Data Do You Need?
&lt;/h4&gt;

&lt;p&gt;There’s no magic number, but as general guidance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;10–15 examples per intent → minimum prototype&lt;/li&gt;
&lt;li&gt;50–100 examples per intent → production baseline&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Diverse phrasing is nearly always better than repetitive patterns.&lt;/p&gt;

&lt;p&gt;Bad example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Book a flight to Paris&lt;/li&gt;
&lt;li&gt;Book a flight to Berlin&lt;/li&gt;
&lt;li&gt;Book a flight to London&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Good example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;I need to travel to Paris&lt;/li&gt;
&lt;li&gt;Can you find flights to Berlin?&lt;/li&gt;
&lt;li&gt;Get me a ticket heading to London&lt;/li&gt;
&lt;li&gt;Fly me to Rome tomorrow&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Variation teaches generalisation.&lt;/p&gt;

&lt;p&gt;We've already covered pipeline configuration; curious readers can revisit the intermediate blogs in the playlist to see how it works.&lt;/p&gt;

&lt;h4&gt;
  
  
  Training the Model
&lt;/h4&gt;

&lt;p&gt;Once data and configuration are ready, training is simple:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;rasa train&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Behind the scenes, Rasa reads the NLU data and builds a vocabulary, initialises the DIET model, and runs multiple training epochs, optimising a joint loss for intent and entity prediction.&lt;br&gt;
The learned parameters are then saved as a trained model file:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;models/20260215-123456.tar.gz&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This archive contains:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;NLU model&lt;/li&gt;
&lt;li&gt;Dialogue model&lt;/li&gt;
&lt;li&gt;Metadata&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now your assistant is runnable.&lt;/p&gt;
&lt;h4&gt;
  
  
  What Happens During Training?
&lt;/h4&gt;

&lt;p&gt;Internally:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Text is tokenised.&lt;/li&gt;
&lt;li&gt;Tokens are vectorised.&lt;/li&gt;
&lt;li&gt;Transformer layers process context.&lt;/li&gt;
&lt;li&gt;Intent and entity losses are computed jointly.&lt;/li&gt;
&lt;li&gt;Gradients update shared weights.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You don’t manually tune features.&lt;/p&gt;

&lt;p&gt;What the developer can tune:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;epochs&lt;/li&gt;
&lt;li&gt;learning rate (advanced use)&lt;/li&gt;
&lt;li&gt;embedding dimensions&lt;/li&gt;
&lt;li&gt;batch size&lt;/li&gt;
&lt;/ul&gt;
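These knobs are set on the DIETClassifier entry in config.yml. A minimal sketch with illustrative values (check the Rasa docs for the current defaults):

```yaml
pipeline:
  - name: WhitespaceTokenizer
  - name: CountVectorsFeaturizer
  - name: DIETClassifier
    epochs: 100
    learning_rate: 0.001
    embedding_dimension: 20
    batch_size: [64, 256]
```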
&lt;h4&gt;
  
  
  Testing the Model
&lt;/h4&gt;

&lt;p&gt;After training:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;rasa shell nlu&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This starts an interactive shell where you can experience the model yourself: send messages, probe its limitations, and note improvements as you go.&lt;/p&gt;

&lt;p&gt;Every input returns a structured result.&lt;br&gt;
Say you type:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Book a flight to Madrid tomorrow&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You should see output similar to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "intent": {
    "name": "book_flight",
    "confidence": 0.94
  },
  "entities": [
    {
      "entity": "location",
      "value": "Madrid"
    }
  ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is DIET in action, trained on your data.&lt;/p&gt;

&lt;h4&gt;
  
  
  Common Beginner Mistakes
&lt;/h4&gt;

&lt;p&gt;Now to address a few common yet harmful beginner errors.&lt;br&gt;
The overarching principle: improving data quality almost always improves performance more than tweaking the architecture.&lt;/p&gt;

&lt;p&gt;So you should focus on:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Diverse phrasing&lt;/li&gt;
&lt;li&gt;Balanced intents&lt;/li&gt;
&lt;li&gt;Clear entity boundaries&lt;/li&gt;
&lt;li&gt;Avoiding overlapping intent meanings&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Good data reduces ambiguity.&lt;/p&gt;

&lt;p&gt;And try avoiding:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Too few examples&lt;/li&gt;
&lt;li&gt;Overlapping intents&lt;/li&gt;
&lt;li&gt;Copy-paste variations&lt;/li&gt;
&lt;li&gt;Mixing business logic into NLU&lt;/li&gt;
&lt;li&gt;Ignoring real user phrasing&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Always remember:&lt;/p&gt;

&lt;p&gt;NLU predicts meaning; it does not enforce workflow.&lt;br&gt;
And training is an iterative loop:&lt;br&gt;
Train → Test → Improve → Retrain.&lt;/p&gt;

&lt;h4&gt;
  
  
  Where We Go Next
&lt;/h4&gt;

&lt;p&gt;Now that we know:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How to generate training data&lt;/li&gt;
&lt;li&gt;How to configure DIET&lt;/li&gt;
&lt;li&gt;How to train a Rasa model&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Next, we’ll connect NLU to dialogue training:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Domain files&lt;/li&gt;
&lt;li&gt;Stories&lt;/li&gt;
&lt;li&gt;Rules&lt;/li&gt;
&lt;li&gt;Slot filling&lt;/li&gt;
&lt;li&gt;End-to-end training&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because predicting intent is only step one. Building behaviour is step two. Now we begin building real assistants.&lt;/p&gt;

&lt;p&gt;Until next time.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>chatbot</category>
      <category>yaml</category>
      <category>rasa</category>
    </item>
    <item>
      <title>What is the DIETClassifier?</title>
      <dc:creator>Unknownerror-404</dc:creator>
      <pubDate>Sun, 08 Feb 2026 01:29:37 +0000</pubDate>
      <link>https://forem.com/aniket_kuyate_15acc4e6587/what-is-the-dietclassifier-1n4j</link>
      <guid>https://forem.com/aniket_kuyate_15acc4e6587/what-is-the-dietclassifier-1n4j</guid>
      <description>&lt;p&gt;In the previous blog, we explored CRFEntityExtractor, a sequence-labeling model that learns how entities appear in context using statistical features.&lt;/p&gt;

&lt;p&gt;CRF represented a major step forward from pure rule-based extraction.&lt;br&gt;
But as conversational systems evolved, maintaining separate models for intent classification and entity extraction started to show its limits.&lt;/p&gt;

&lt;p&gt;Modern NLU pipelines favor shared representations, joint learning, and deep learning–based generalization.&lt;/p&gt;

&lt;p&gt;That’s where DIETClassifier comes in.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Contents of this blog&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What is DIETClassifier&lt;/li&gt;
&lt;li&gt;Why DIET was introduced&lt;/li&gt;
&lt;li&gt;How DIET works at a high level&lt;/li&gt;
&lt;li&gt;Intent classification with DIET&lt;/li&gt;
&lt;li&gt;Entity extraction with DIET&lt;/li&gt;
&lt;li&gt;Training data format&lt;/li&gt;
&lt;li&gt;When to use DIETClassifier&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;
  
  
  What is the DIETClassifier?
&lt;/h4&gt;

&lt;p&gt;DIET stands for Dual Intent and Entity Transformer.&lt;/p&gt;

&lt;p&gt;It is a single neural network that performs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Intent classification&lt;/li&gt;
&lt;li&gt;Entity extraction&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;…at the same time.&lt;/p&gt;

&lt;p&gt;Unlike CRFEntityExtractor, which focuses only on entities, DIET jointly learns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The meaning of the full sentence (intent)&lt;/li&gt;
&lt;li&gt;The role of each token (entity labels)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This shared learning allows the model to use intent-level context to improve entity prediction, and vice versa.&lt;/p&gt;
&lt;h4&gt;
  
  
  Why was DIET introduced?
&lt;/h4&gt;

&lt;p&gt;Traditional pipelines looked like this:&lt;br&gt;
Intent classifier → predicts intent&lt;br&gt;
Entity extractor → predicts entities independently&lt;/p&gt;

&lt;p&gt;This separation has drawbacks:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Duplicate feature computation&lt;/li&gt;
&lt;li&gt;No shared understanding between intent and entities&lt;/li&gt;
&lt;li&gt;More models to train, tune, and maintain&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;DIET solves this by using one model to learn shared embeddings and optimise both tasks together.&lt;/p&gt;

&lt;p&gt;This leads to better performance, especially when training data is limited.&lt;/p&gt;
&lt;h4&gt;
  
  
  How DIET works
&lt;/h4&gt;

&lt;p&gt;DIET is based on a Transformer architecture.&lt;br&gt;
At a high level, it:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Tokenizes the input text&lt;br&gt;
Converts tokens into embeddings&lt;br&gt;
Applies transformer layers to model context&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;and predicts:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A sentence embedding → intent&lt;br&gt;
Token-level labels → entities&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Instead of hand-engineered features (as in CRF), DIET learns features automatically.&lt;/p&gt;
&lt;h4&gt;
  
  
  Intent classification with DIET
&lt;/h4&gt;

&lt;p&gt;For intent classification, DIET:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Embeds the entire sentence&lt;/li&gt;
&lt;li&gt;Compares it against learned intent embeddings&lt;/li&gt;
&lt;li&gt;Uses similarity scoring to choose the best intent&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Book a flight to Paris."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The model learns that this sentence embedding is closest to the book_flight intent. This approach allows DIET to generalize well to paraphrases and unseen phrasing.&lt;/p&gt;
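A toy sketch of that similarity scoring, using made-up 3-dimensional vectors (real DIET embeddings are learned during training and have far more dimensions; the numbers and intent names here are assumptions for illustration):

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy learned intent embeddings (illustrative values only).
intent_embeddings = {
    "book_flight": [0.9, 0.1, 0.2],
    "greet":       [0.1, 0.8, 0.1],
}

def classify(sentence_embedding):
    """Pick the intent whose embedding is most similar to the sentence."""
    scores = {name: cosine(sentence_embedding, emb)
              for name, emb in intent_embeddings.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

print(classify([0.8, 0.2, 0.1]))  # ('book_flight', ...) — high similarity
```

A sentence embedding close to the book_flight vector wins the similarity comparison, which is exactly why paraphrases land on the same intent.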
&lt;h4&gt;
  
  
  Entity extraction with DIET
&lt;/h4&gt;

&lt;p&gt;For entities, DIET performs token-level classification, similar to CRF. Each token receives labels like B-entity, I-entity, O, etc.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Book    O
a       O
flight  O
from    O
New     B-location
York    I-location
to      O
Paris   B-location
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The difference is that DIET uses contextual embeddings produced by transformers instead of manually designed features.&lt;/p&gt;

&lt;h4&gt;
  
  
  Training data format
&lt;/h4&gt;

&lt;p&gt;DIET uses the same annotated NLU data as CRF.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;version: "3.1"

nlu:
  - intent: book_flight
    examples: |
      - Book a flight from [New York](location) to [Paris](location)
      - Fly from [Berlin](location) to [London](location)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There is no separate configuration for intent vs entity training. DIET learns both from the same data.&lt;/p&gt;

&lt;h4&gt;
  
  
  Internal working (simplified)
&lt;/h4&gt;

&lt;p&gt;At runtime, DIET:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Tokenizes the message&lt;/li&gt;
&lt;li&gt;Generates embeddings&lt;/li&gt;
&lt;li&gt;Applies transformer layers&lt;/li&gt;
&lt;li&gt;Predicts:

&lt;ul&gt;
&lt;li&gt;Intent with confidence&lt;/li&gt;
&lt;li&gt;Entity labels per token&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Groups entity tokens&lt;/li&gt;
&lt;li&gt;Outputs structured NLU results&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Example output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "intent": {
    "name": "book_flight",
    "confidence": 0.92
  },
  "entities": [
    {
      "entity": "location",
      "value": "Paris",
      "start": 23,
      "end": 28
    }
  ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  When should you use DIETClassifier?
&lt;/h4&gt;

&lt;p&gt;DIETClassifier is the default choice when you want a single model for intents and entities, when the language is flexible and conversational,&lt;br&gt;
and when you care about long-term scalability or are building production-grade assistants.&lt;/p&gt;

&lt;p&gt;CRFEntityExtractor and RegexEntityExtractor still have value, especially for highly structured or deterministic entities, but DIET is the backbone of modern Rasa NLU pipelines.&lt;/p&gt;

&lt;p&gt;With this, we have completed most of the major entity and intent mappers. Following this, we shall begin to see how bots are developed using code.&lt;/p&gt;

&lt;p&gt;Until next time.&lt;/p&gt;

</description>
      <category>chatbot</category>
      <category>rasa</category>
      <category>yaml</category>
      <category>ai</category>
    </item>
    <item>
      <title>Understanding CRFEntityExtractor: Learning Entities from Context</title>
      <dc:creator>Unknownerror-404</dc:creator>
      <pubDate>Fri, 23 Jan 2026 13:46:39 +0000</pubDate>
      <link>https://forem.com/aniket_kuyate_15acc4e6587/understanding-crfentityextractor-learning-entities-from-context-2jp4</link>
      <guid>https://forem.com/aniket_kuyate_15acc4e6587/understanding-crfentityextractor-learning-entities-from-context-2jp4</guid>
      <description>&lt;p&gt;In the previous blog, we explored &lt;a href="https://dev.tourl"&gt;RegexEntityExtractor&lt;/a&gt;, a rule-based approach where entities are extracted by explicitly matching patterns.&lt;/p&gt;

&lt;p&gt;That works extremely well when entity formats are predictable.&lt;/p&gt;

&lt;p&gt;But not all entities behave that way.&lt;/p&gt;

&lt;p&gt;Some entities depend heavily on context, word boundaries, and surrounding tokens.&lt;br&gt;
This is where statistical learning becomes necessary.&lt;/p&gt;

&lt;p&gt;Enter the CRFEntityExtractor.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Contents of this blog&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What is CRFEntityExtractor&lt;/li&gt;
&lt;li&gt;Why do we need it&lt;/li&gt;
&lt;li&gt;How CRF works at a high level&lt;/li&gt;
&lt;li&gt;Training data format&lt;/li&gt;
&lt;li&gt;Pipeline configuration&lt;/li&gt;
&lt;li&gt;Internal working&lt;/li&gt;
&lt;li&gt;Strengths and limitations&lt;/li&gt;
&lt;li&gt;When and why to use it&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What is the CRFEntityExtractor?&lt;/strong&gt;&lt;br&gt;
The CRFEntityExtractor is a machine learning based entity extractor that uses a Conditional Random Field (CRF) model.&lt;/p&gt;

&lt;p&gt;Unlike regex-based extractors, it does not rely on fixed patterns.&lt;br&gt;
Instead, it learns how entities appear in context from labeled training data.&lt;/p&gt;

&lt;p&gt;In simple terms:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Given a sequence of tokens, the model learns which tokens belong to which entity types.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This allows it to extract entities even when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Formats vary&lt;/li&gt;
&lt;li&gt;Words are ambiguous&lt;/li&gt;
&lt;li&gt;Structure is loose&lt;/li&gt;
&lt;li&gt;Context determines meaning&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;&lt;strong&gt;Why do we need it?&lt;/strong&gt;&lt;br&gt;
Many real-world entities are not strictly structured.&lt;/p&gt;

&lt;p&gt;Examples: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Person names&lt;/li&gt;
&lt;li&gt;Locations&lt;/li&gt;
&lt;li&gt;Job titles&lt;/li&gt;
&lt;li&gt;Product names&lt;/li&gt;
&lt;li&gt;Custom domain-specific terms&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Consider the word “Apple”:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;“Buy Apple stock” → organization&lt;/li&gt;
&lt;li&gt;“Eat an apple” → food&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Regex cannot solve this.&lt;br&gt;
CRF can, because it looks at neighboring tokens, not just the token itself.&lt;/p&gt;


&lt;p&gt;&lt;strong&gt;How CRF works (high level)&lt;/strong&gt;&lt;br&gt;
CRF is a sequence labeling model.&lt;br&gt;
Instead of classifying individual tokens independently, it predicts the most likely sequence of labels for an entire sentence.&lt;/p&gt;

&lt;p&gt;Each token is assigned a label such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;B-entity (beginning)&lt;/li&gt;
&lt;li&gt;I-entity (inside)&lt;/li&gt;
&lt;li&gt;O (outside)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Book a flight from New York to Paris&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Token labels might look like:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Book     O&lt;br&gt;
a        O&lt;br&gt;
flight   O&lt;br&gt;
from     O&lt;br&gt;
New      B-location&lt;br&gt;
York     I-location&lt;br&gt;
to       O&lt;br&gt;
Paris    B-location&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The CRF learns which label sequences are valid and likely, not just which individual labels fit.&lt;/p&gt;
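To see what "most likely sequence" means, here is a toy Viterbi decoder over BIO labels. The emission and transition scores are invented for illustration (a trained CRF learns its own); note how the transitions make I-location essentially impossible after O:

```python
LABELS = ["O", "B-location", "I-location"]

# transition[prev][cur]: invented scores; I-location can't follow O.
transition = {
    "O":          {"O": 0.8, "B-location": 0.2, "I-location": 0.0},
    "B-location": {"O": 0.4, "B-location": 0.1, "I-location": 0.5},
    "I-location": {"O": 0.5, "B-location": 0.1, "I-location": 0.4},
}

def viterbi(emissions):
    """emissions: one dict per token mapping label -> score.
    Returns the highest-scoring label sequence."""
    path = {l: [l] for l in LABELS}
    score = {l: emissions[0][l] for l in LABELS}
    for em in emissions[1:]:
        new_score, new_path = {}, {}
        for cur in LABELS:
            prev = max(LABELS, key=lambda p: score[p] * transition[p][cur])
            new_score[cur] = score[prev] * transition[prev][cur] * em[cur]
            new_path[cur] = path[prev] + [cur]
        score, path = new_score, new_path
    return path[max(LABELS, key=score.get)]

# "New York": per-token scores alone are ambiguous, but the
# transition scores make the B- then I- sequence win.
emissions = [
    {"O": 0.3, "B-location": 0.6, "I-location": 0.1},  # New
    {"O": 0.3, "B-location": 0.2, "I-location": 0.5},  # York
]
print(viterbi(emissions))  # ['B-location', 'I-location']
```

This is the key difference from per-token classification: the label for "York" is chosen jointly with the label for "New", not in isolation.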

&lt;p&gt;&lt;strong&gt;Training data format&lt;/strong&gt;&lt;br&gt;
CRFEntityExtractor requires annotated training data in your NLU YAML file.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;version: "3.1"

nlu:
  - intent: book_flight
    examples: |
      - Book a flight from [New York](location) to [Paris](location)
      - Fly from [Berlin](location) to [London](location)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From this data, the model learns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Token patterns&lt;/li&gt;
&lt;li&gt;Contextual relationships&lt;/li&gt;
&lt;li&gt;Entity boundaries&lt;/li&gt;
&lt;li&gt;Transition probabilities between labels&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;More diverse examples generally lead to better generalization.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pipeline configuration&lt;/strong&gt;&lt;br&gt;
To enable CRF-based extraction, add it to your pipeline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pipeline:
  - name: WhitespaceTokenizer
  - name: LexicalSyntacticFeaturizer
  - name: CRFEntityExtractor

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key supporting components:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tokenizer → splits text into tokens&lt;/li&gt;
&lt;li&gt;Featurizer → generates features such as:

&lt;ul&gt;
&lt;li&gt;Lowercase form&lt;/li&gt;
&lt;li&gt;Word shape&lt;/li&gt;
&lt;li&gt;Prefixes / suffixes&lt;/li&gt;
&lt;li&gt;Token position&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;CRF does not work directly on raw text; it works on features.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Internal Working&lt;/strong&gt;&lt;br&gt;
At runtime, the CRFEntityExtractor operates roughly as follows:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Tokenizes the user message&lt;/li&gt;
&lt;li&gt;Generates features for each token&lt;/li&gt;
&lt;li&gt;Applies the trained CRF model&lt;/li&gt;
&lt;li&gt;Predicts a label for every token&lt;/li&gt;
&lt;li&gt;Groups consecutive B- / I- labels into entities&lt;/li&gt;
&lt;li&gt;Outputs entities with:

&lt;ul&gt;
&lt;li&gt;Entity name&lt;/li&gt;
&lt;li&gt;Extracted value&lt;/li&gt;
&lt;li&gt;Start and end character indices&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For the input:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"I want to fly from San Francisco"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The extractor may output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "entity": "location",
  "value": "San Francisco",
  "start": 19,
  "end": 32
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The phrase is extracted not because it matches a pattern, but because the model learned that this sequence of tokens commonly forms a location.&lt;/p&gt;
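Steps 4–6 of the runtime sketch above (label every token, then group B-/I- runs into entities with character offsets) can be illustrated in plain Python. The whitespace tokenisation and the hand-written labels are assumptions for the example:

```python
import re

def bio_to_entities(text, labels):
    """Group consecutive B-/I- labels over whitespace tokens into
    entities with character offsets."""
    tokens = [(m.group(), m.start(), m.end())
              for m in re.finditer(r"\S+", text)]
    assert len(tokens) == len(labels)
    entities, current = [], None
    for (tok, start, end), label in zip(tokens, labels):
        if label.startswith("B-"):          # a new entity begins
            if current:
                entities.append(current)
            current = {"entity": label[2:], "start": start, "end": end}
        elif label.startswith("I-") and current and current["entity"] == label[2:]:
            current["end"] = end            # extend the running entity
        else:                               # O label: close any open entity
            if current:
                entities.append(current)
            current = None
    if current:
        entities.append(current)
    for e in entities:                      # recover the surface value
        e["value"] = text[e["start"]:e["end"]]
    return entities

text = "I want to fly from San Francisco"
labels = ["O", "O", "O", "O", "O", "B-location", "I-location"]
print(bio_to_entities(text, labels))
# [{'entity': 'location', 'start': 19, 'end': 32, 'value': 'San Francisco'}]
```

The two location tokens collapse into a single entity spanning characters 19–32, matching the output shown above.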

&lt;p&gt;&lt;strong&gt;When should CRFEntityExtractor be used?&lt;/strong&gt;&lt;br&gt;
CRFEntityExtractor is a good fit when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Entity boundaries depend on context&lt;/li&gt;
&lt;li&gt;Formats are inconsistent or unknown&lt;/li&gt;
&lt;li&gt;Natural language varies widely&lt;/li&gt;
&lt;li&gt;You want generalization rather than exact matching&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It is often used alongside RegexEntityExtractor, not instead of it.&lt;br&gt;
Each extractor solves a different problem class.&lt;/p&gt;

&lt;p&gt;In the next blog, we’ll look at how DIETClassifier unifies intent classification and entity extraction, and why modern pipelines increasingly rely on it over standalone CRF models.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>rasa</category>
      <category>chatbot</category>
      <category>yaml</category>
    </item>
    <item>
      <title>Understanding the RegexEntityExtractor in RASA</title>
      <dc:creator>Unknownerror-404</dc:creator>
      <pubDate>Mon, 19 Jan 2026 12:00:00 +0000</pubDate>
      <link>https://forem.com/aniket_kuyate_15acc4e6587/understanding-the-regexentityextractor-in-rasa-4903</link>
      <guid>https://forem.com/aniket_kuyate_15acc4e6587/understanding-the-regexentityextractor-in-rasa-4903</guid>
      <description>&lt;p&gt;Our &lt;a href="https://dev.to/aniket_kuyate_15acc4e6587/understanding-the-entity-synonym-mapper-in-rasa-56be"&gt;previous blog&lt;/a&gt; explored how the Entity Synonym Mapper helps normalize extracted entities into canonical values.&lt;/p&gt;

&lt;p&gt;Hereafter, we’ll move one step deeper into how entities are detected in the first place, specifically using pattern-based extraction.&lt;br&gt;
This is where the RegexEntityExtractor comes into play.&lt;/p&gt;
&lt;h2&gt;
  
  
  Contents of this blog
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;What is RegexEntityExtractor&lt;/li&gt;
&lt;li&gt;YAML configuration&lt;/li&gt;
&lt;li&gt;Internal working&lt;/li&gt;
&lt;li&gt;When and why to use it&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What is the RegexEntityExtractor?&lt;/strong&gt;&lt;br&gt;
The RegexEntityExtractor is a rule-based entity extractor that uses regular expressions to identify entities in user input.&lt;br&gt;
Unlike ML-based extractors, it does not learn from data.&lt;br&gt;
Instead, it works on a very simple principle:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If the text matches a predefined pattern, extract it as an entity.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This makes it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Deterministic&lt;/li&gt;
&lt;li&gt;Fast&lt;/li&gt;
&lt;li&gt;Extremely precise (when patterns are well-defined)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Why do we need it?&lt;/strong&gt;&lt;br&gt;
Not all entities are ambiguous.&lt;br&gt;
Some entities:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Follow fixed formats&lt;/li&gt;
&lt;li&gt;Are numerical or structured&lt;/li&gt;
&lt;li&gt;Do not benefit from ML generalization&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Phone numbers&lt;/li&gt;
&lt;li&gt;Email addresses&lt;/li&gt;
&lt;li&gt;Order IDs&lt;/li&gt;
&lt;li&gt;Dates&lt;/li&gt;
&lt;li&gt;ZIP codes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Trying to train an ML model to extract these is often overkill.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;YAML Configuration Example&lt;/strong&gt;&lt;br&gt;
Regex patterns are defined directly in your NLU YAML file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;version: "3.1"

nlu:
  - regex: phone_number
    examples: |
      - \d{10}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Pipeline Configuration&lt;/strong&gt;&lt;br&gt;
To enable it, the extractor must be added to your pipeline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pipeline:
  - name: WhitespaceTokenizer
  - name: RegexEntityExtractor
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Internal Working&lt;/strong&gt;&lt;br&gt;
At a low level, the RegexEntityExtractor works as follows:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Takes the raw user message&lt;/li&gt;
&lt;li&gt;Iterates over each regex pattern defined in YAML&lt;/li&gt;
&lt;li&gt;Applies the pattern to the text&lt;/li&gt;
&lt;li&gt;If a match is found:

&lt;ul&gt;
&lt;li&gt;Extracts the matched substring&lt;/li&gt;
&lt;li&gt;Assigns it as an entity&lt;/li&gt;
&lt;li&gt;Stores start and end character indices&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
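&lt;p&gt;The matching loop above can be sketched in plain Python (the helper name and pattern table are made up for illustration; the real logic lives inside the RASA component):&lt;/p&gt;

```python
import re

# Hypothetical pattern table, mirroring what RegexEntityExtractor reads from YAML.
PATTERNS = {"phone_number": r"\d{10}"}

def extract_entities(text):
    """Apply each regex to the message and emit entity dicts with offsets."""
    entities = []
    for name, pattern in PATTERNS.items():
        for match in re.finditer(pattern, text):
            entities.append({
                "entity": name,
                "value": match.group(),
                "start": match.start(),
                "end": match.end(),
            })
    return entities

print(extract_entities("My phone number is 9876543210"))
```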

&lt;p&gt;Consider the example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"My phone number is 9876543210"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then the entity extracted is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "entity": "phone_number",
  "value": "9876543210",
  "start": 19,
  "end": 29
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Combining with Entity Synonym Mapper&lt;/strong&gt;&lt;br&gt;
A very common pattern is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;RegexEntityExtractor extracts the entity&lt;/li&gt;
&lt;li&gt;Entity Synonym Mapper normalizes it&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This combination gives:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Precision&lt;/li&gt;
&lt;li&gt;Consistency&lt;/li&gt;
&lt;li&gt;Clean downstream data&lt;/li&gt;
&lt;/ul&gt;
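&lt;p&gt;As a toy illustration of this two-step flow (the city pattern, synonym table, and helper name are all invented for the example):&lt;/p&gt;

```python
import re

# Step 1: pattern-based extraction; step 2: synonym normalization.
SYNONYMS = {"NYC": "New York City", "Big Apple": "New York City"}

def extract_city(text):
    match = re.search(r"NYC|Big Apple|New York City", text)
    if match is None:
        return None
    # Map the matched surface form to its canonical value.
    return SYNONYMS.get(match.group(), match.group())

print(extract_city("Book a flight to NYC"))  # New York City
```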

&lt;p&gt;&lt;strong&gt;When should RegexEntityExtractor be used?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When the entity format is predictable&lt;/li&gt;
&lt;li&gt;When precision matters more than recall&lt;/li&gt;
&lt;li&gt;When you want to reduce ML complexity&lt;/li&gt;
&lt;li&gt;When you want deterministic behavior&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Hereafter we’ll explore CRFEntityExtractor, where entities are learned statistically rather than matched explicitly.&lt;/p&gt;

</description>
      <category>yaml</category>
      <category>llm</category>
      <category>rasa</category>
      <category>chatbot</category>
    </item>
    <item>
      <title>Understanding the Entity Synonym Mapper in RASA</title>
      <dc:creator>Unknownerror-404</dc:creator>
      <pubDate>Sat, 17 Jan 2026 13:25:14 +0000</pubDate>
      <link>https://forem.com/aniket_kuyate_15acc4e6587/understanding-the-entity-synonym-mapper-in-rasa-56be</link>
      <guid>https://forem.com/aniket_kuyate_15acc4e6587/understanding-the-entity-synonym-mapper-in-rasa-56be</guid>
      <description>&lt;p&gt;Our previous blog: &lt;a href="https://dev.to/aniket_kuyate_15acc4e6587/understating-the-whitespace-tokenizers-2ic7"&gt;Understanding RASA pipelines&lt;/a&gt; &lt;br&gt;
described how RASA NLU handles stories, rules, policies, and forms.&lt;/p&gt;

&lt;p&gt;Hereafter, we'll dive deeper into how entities are normalized in RASA and how the Entity Synonym Mapper works, with YAML examples and practical insights for pipeline development.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Contents of this blog&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What is the Entity Synonym Mapper?&lt;/li&gt;
&lt;li&gt;Why entity normalization is important&lt;/li&gt;
&lt;li&gt;YAML configuration example&lt;/li&gt;
&lt;li&gt;Internal working and considerations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What is the Entity Synonym Mapper?&lt;/strong&gt;&lt;br&gt;
As we discussed before, a pipeline is made up of modular components, each performing a small but important operation.&lt;/p&gt;

&lt;p&gt;The Entity Synonym Mapper is one such component in RASA NLU pipelines. Its primary role is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;To map different textual representations of the same concept to a canonical form so your model can treat them equivalently.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Think of it as a translator for your entities. For example, your users might type:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;"NYC"&lt;/li&gt;
&lt;li&gt;"New York City"&lt;/li&gt;
&lt;li&gt;"Big Apple"&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;All of these mean the same place, but without normalization, your chatbot would treat them as different entities. The Entity Synonym Mapper ensures that all of these map to a single canonical value, e.g., "New York City".&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why is this important?&lt;/strong&gt;&lt;br&gt;
Machine learning models, and NLP pipelines in general, cannot reason about synonyms automatically.&lt;/p&gt;

&lt;p&gt;Without normalization:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Intent classification might succeed, but entity extraction will be inconsistent.&lt;/li&gt;
&lt;li&gt;Downstream processes, like database queries or API calls, may fail if the entity values are inconsistent.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With normalization:&lt;br&gt;
"NYC" → "New York City"&lt;br&gt;
"Big Apple" → "New York City"&lt;/p&gt;

&lt;p&gt;This reduces variance, improves training efficiency, and ensures predictable behavior.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;YAML Configuration Example&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The Entity Synonym Mapper uses a YAML file to define the synonyms. Here’s a minimal example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;version: "3.1"

nlu:
  - intent: inform_city
    examples: |
      - I want to travel to [NYC](city)
      - I'm going to [Big Apple](city)
      - Book a hotel in [New York City](city)

  - synonym: New York City
    examples: |
      - NYC
      - Big Apple
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;How this works:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The intent section shows how users might express a concept in multiple ways.&lt;/li&gt;
&lt;li&gt;The synonym section defines the canonical value (New York City) and the variations that should be mapped to it (NYC, Big Apple).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Once defined, any entity recognized as one of the variations is automatically replaced by the canonical value.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Internal Working&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;At a low level, the Entity Synonym Mapper operates like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Entity extraction happens first (via your pipeline’s entity extractors).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The Mapper checks if the extracted entity matches any synonym entry in the YAML file.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If a match is found, the entity value is replaced with the canonical value.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Think of it as a dictionary lookup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;synonyms = {
    "NYC": "New York City",
    "Big Apple": "New York City"
}

entity = "NYC"
canonical_value = synonyms.get(entity, entity)
print(canonical_value)
# Output: New York City
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Practical Example&lt;/strong&gt;&lt;br&gt;
Imagine a chatbot for booking flights:&lt;/p&gt;

&lt;p&gt;User inputs:&lt;br&gt;
&lt;code&gt;"I want to fly to Big Apple next week"&lt;/code&gt;&lt;br&gt;
Without the Entity Synonym Mapper:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;{&lt;br&gt;
  "intent": "inform_city",&lt;br&gt;
  "entities": [{&lt;br&gt;
      "entity": "city",&lt;br&gt;
      "value": "Big Apple"}]&lt;br&gt;
 }&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;With the Mapper:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;{&lt;br&gt;
  "intent": "inform_city",&lt;br&gt;
  "entities": [{&lt;br&gt;
      "entity": "city",&lt;br&gt;
      "value": "New York City"}]&lt;br&gt;
}&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Now your downstream logic, such as searching flight databases, always receives consistent entity values, eliminating errors.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When to Use the Entity Synonym Mapper&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When you have common abbreviations or nicknames in user input.&lt;/li&gt;
&lt;li&gt;When you want consistent entity values for downstream actions.&lt;/li&gt;
&lt;li&gt;When training on multiple intents that share the same entity concept but have different expressions.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We’ll explore RegexEntityExtractor, diving into pattern-based entity extraction and how it complements the Entity Synonym Mapper for robust NLU.&lt;/p&gt;

</description>
      <category>llm</category>
      <category>chatbot</category>
      <category>yaml</category>
      <category>rasa</category>
    </item>
    <item>
      <title>Understanding the whitespace tokenizer!</title>
      <dc:creator>Unknownerror-404</dc:creator>
      <pubDate>Thu, 08 Jan 2026 11:50:00 +0000</pubDate>
      <link>https://forem.com/aniket_kuyate_15acc4e6587/understating-the-whitespace-tokenizers-2ic7</link>
      <guid>https://forem.com/aniket_kuyate_15acc4e6587/understating-the-whitespace-tokenizers-2ic7</guid>
      <description>&lt;p&gt;Our previous blog: &lt;a href="https://dev.to/aniket_kuyate_15acc4e6587/understanding-rasa-pipelines-gii"&gt;Understanding RASA pipelines&lt;/a&gt; describes how RASA NLU handles stories, rules, policies and forms. &lt;br&gt;
Hereafter, we'll dive deeper into how pipelines are developed and how each component fits in.&lt;/p&gt;
&lt;h1&gt;
  
  
  Contents of this blog:
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;Developing Pipelines&lt;/li&gt;
&lt;li&gt;WhitespaceTokenizer&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Developing Pipelines:
&lt;/h2&gt;

&lt;p&gt;As we discussed in the last blog, a pipeline is the basic architecture of any chatbot. These pipelines are built much like functional or object-oriented code: the programmer writes small functions for specific operations and then composes them into larger functionality.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def add(x, y):
    return x + y

def add_two_num(a, b):
    print(add(a, b))

if __name__ == "__main__":
    num1 = int(input("Provide the 1st num: "))
    num2 = int(input("Provide the 2nd num: "))
    add_two_num(num1, num2)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When we develop a pipeline, the basic considerations are: what do we want to achieve, and is there a pre-existing package that already does it?&lt;/p&gt;

&lt;p&gt;If your answer is yes, it makes things very easy!&lt;/p&gt;

&lt;p&gt;The most basic resources for anyone working with RASA lie in its &lt;a href="https://rasa.com/docs" rel="noopener noreferrer"&gt;base documentation&lt;/a&gt;, the &lt;a href="https://github.com/RasaHQ/rasa" rel="noopener noreferrer"&gt;base repository&lt;/a&gt;, and their &lt;a href="https://rasa.com/docs/reference" rel="noopener noreferrer"&gt;API&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Once we identify a set of pipeline components that could be useful for us, we stack them one on top of the other, building up our functionality.&lt;/p&gt;

&lt;h2&gt;
  
  
  Example:
&lt;/h2&gt;

&lt;p&gt;Very recently I developed a bot for a clinic. As the answers required consistency and the queries could range from 'I need a vet' to 'Mind one for the animal doctor', RASA was the perfect fit.&lt;/p&gt;

&lt;p&gt;When I was working with RASA, I built the architecture bottom-up, beginning by defining what counts as a word.&lt;/p&gt;

&lt;p&gt;This is where we use the:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;WhitespaceTokenizer&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Now, even though we've heard of or mentioned the 'WhitespaceTokenizer' before this blog, I want to dive deep into the workings of the module.&lt;/p&gt;

&lt;p&gt;It is the first step of a RASA NLU pipeline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pipeline:
- name: WhitespaceTokenizer
  intent_tokenization_flag: true
  intent_split_symbol: "_"

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The only purpose it serves is to break user sentences into 'tokens'; it is not used for syntactic analysis, intent analysis, or even sentence normalisation.&lt;/p&gt;

&lt;p&gt;It is what decides where one 'token' ends and the next begins; note that, as redundant as this may sound, we work with tokens, not words.&lt;/p&gt;

&lt;p&gt;As ML models are unable to work directly with large string data, or rather raw text, they use tokens, which are then converted to features and further into embeddings. The WhitespaceTokenizer is the simplest type: it only looks for whitespace within a sentence to define tokens.&lt;/p&gt;

&lt;h3&gt;
  
  
  Internal working:
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Tokenization
&lt;/h4&gt;

&lt;p&gt;Consider a sentence, as:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;'Hey? Can you direct me to the purchase page?'&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Now the tokenizer works by dividing the sentence on whitespace, forming a list:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;["Hey?", "Can", "you", "direct", "me", "to", "the", "purchase", "page?"]&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The tokenizer does not remove any punctuation from sentences; this simple rule allows a range of emotions to be captured from each input.&lt;/p&gt;

&lt;p&gt;Linguistically, Hey!, Hey?, Hey? (hesitant), or even a plain Hey can carry a multitude of different meanings, which the model must capture to be precise. Whenever the module forms a single token, the information it stores consists of the token text itself, its starting character index within the string, and its ending character index.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "text": "direct",
  "start": 13,
  "end": 19
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
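&lt;p&gt;This splitting and offset bookkeeping can be sketched in a few lines of Python (an illustrative toy, not RASA's actual implementation):&lt;/p&gt;

```python
# Toy whitespace tokenizer: split on whitespace, keep punctuation,
# and record each token's start/end character indices.
def tokenize(text):
    tokens = []
    cursor = 0
    for word in text.split():
        start = text.index(word, cursor)  # locate this occurrence
        end = start + len(word)
        tokens.append({"text": word, "start": start, "end": end})
        cursor = end
    return tokens

for token in tokenize("Hey? Can you direct me to the purchase page?"):
    print(token)
```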



&lt;p&gt;In terms of low-level code, one could compare RASA's string handling to how strings are terminated in C with '\0', or to null pointers marking the end of a linked list.&lt;/p&gt;

&lt;p&gt;Rather than being used on its own, the WhitespaceTokenizer is seen as a building block. Another similar tokenizer, built around periods, is the RegexTokenizer. It too is consistently used within projects, but rather than working with word-level tokens, it works with paragraphs and further divides them into sentences.&lt;/p&gt;

&lt;p&gt;Now that we have our building block in place, hereafter we'll move on to how sentences are considered syntactically.&lt;/p&gt;

&lt;p&gt;The next blog: &lt;a href="https://dev.to/aniket_kuyate_15acc4e6587/understanding-the-entity-synonym-mapper-in-rasa-56be"&gt;To be released&lt;/a&gt;&lt;/p&gt;

</description>
      <category>yaml</category>
      <category>llm</category>
      <category>chatbot</category>
      <category>rasa</category>
    </item>
    <item>
      <title>Understanding RASA pipelines</title>
      <dc:creator>Unknownerror-404</dc:creator>
      <pubDate>Tue, 06 Jan 2026 11:50:00 +0000</pubDate>
      <link>https://forem.com/aniket_kuyate_15acc4e6587/understanding-rasa-pipelines-gii</link>
      <guid>https://forem.com/aniket_kuyate_15acc4e6587/understanding-rasa-pipelines-gii</guid>
      <description>&lt;p&gt;Our previous blog: &lt;a href="https://dev.to/aniket_kuyate_15acc4e6587/understanding-yaml-45ck"&gt;Understanding YAML&lt;/a&gt; describes how RASA NLU handles entities, intents and how slots are used within RASA.&lt;/p&gt;

&lt;p&gt;This blog will discuss the need and use of stories, rules, policies, and forms within a chatbot.&lt;/p&gt;

&lt;p&gt;Contents of this blog:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stories&lt;/li&gt;
&lt;li&gt;Rules&lt;/li&gt;
&lt;li&gt;Policies&lt;/li&gt;
&lt;li&gt;Forms&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So, What are stories?&lt;/p&gt;

&lt;h2&gt;
  
  
  Stories
&lt;/h2&gt;

&lt;p&gt;If intents describe what the user wants, entities describe the details, and slots describe what the assistant remembers, then stories describe how a conversation flows over time. In simple terms, stories teach RASA what should happen next. They are examples of conversations written from start to finish, showing how the assistant should respond given a sequence of user inputs, slot values, and actions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why stories exist&lt;/strong&gt;&lt;br&gt;
Unlike rule-based chatbots that follow rigid decision trees, RASA learns dialogue behaviour from examples. Stories provide those examples. Instead of explicitly coding:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;If the user says X, then do Y&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;The programmer shows RASA:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;When conversations look like this, the assistant usually responds like that.&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What a story contains?&lt;/strong&gt;&lt;br&gt;
A story is a sequence of: User intents, Optional entities and slot updates, and Assistant actions written chronologically.&lt;/p&gt;

&lt;p&gt;Essentially stories are training sets which train the bot on a set behaviour for some branch of the conversation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User says something
→ Bot responds
→ User provides more info
→ Bot reacts accordingly

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This sequence is what RASA learns from.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Basic story structure&lt;/strong&gt;&lt;br&gt;
Stories are defined in stories.yml.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;version: "3.1"

stories:
- story: report symptom with duration
  steps:
  - intent: report_symptom
    entities:
    - symptom: fever
  - action: action_ask_duration
  - intent: provide_duration
    entities:
    - duration: three days
  - action: action_give_advice
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each step represents one turn in the conversation. This provides the programmer the ability to be as nuanced or intentional as they want to be with their respective bot, and conversation direction.&lt;/p&gt;

&lt;p&gt;However, to ensure the bot doesn't respond to unintended queries, we implement rules.&lt;/p&gt;

&lt;h2&gt;
  
  
  Rules
&lt;/h2&gt;

&lt;p&gt;If stories teach RASA how conversations usually flow, rules define what must always happen. Rules are used when there is no room for ambiguity. They ensure that certain behaviors are deterministic, predictable, and enforced, regardless of context, wording, or conversation history.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why rules exist&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Machine learning is probabilistic by nature. That’s great for flexible conversations, but dangerous when a set condition is required to occur.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A goodbye should always end the conversation.&lt;/li&gt;
&lt;li&gt;A form must always ask missing information.&lt;/li&gt;
&lt;li&gt;An emergency symptom must always escalate.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Rules act as guardrails that override uncertainty within response selection, adding deterministic behaviour within the responses.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What a rule contains&lt;/strong&gt;&lt;br&gt;
A rule describes two important properties of behaviour:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A condition (intent, slot, or active loop)&lt;/li&gt;
&lt;li&gt;A mandatory action that must follow&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Unlike stories, which preserve probabilistic behaviour, rules do not branch, do not generalise, and are applied without variation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Basic rule structure&lt;/strong&gt;&lt;br&gt;
A basic rule structure consists of the name of the rule, and the steps which are to be carried out by that rule.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;version: "3.1"

rules:
- rule: say goodbye
  steps:
  - intent: goodbye
  - action: utter_goodbye
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Rules can also depend on slots or conversation state, meaning certain conditions must be met before the rule is executed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- rule: emergency escalation
  condition:
  - slot_was_set:
    - emergency: true
  steps:
  - action: action_emergency_protocol
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When both rules and stories apply to a conversation, RASA follows a rule-first order: if a matching rule exists, RASA will follow it even if a story suggests a different response.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Policies&lt;/strong&gt;&lt;br&gt;
If intents and entities help RASA understand what the user said, and stories and rules describe how conversations should flow, then policies decide which action the assistant should take next.&lt;br&gt;
Policies are the decision-makers of RASA’s dialogue system. A policy is a strategy that RASA uses to predict the next action based on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The current conversation state&lt;/li&gt;
&lt;li&gt;The intent detected&lt;/li&gt;
&lt;li&gt;Extracted entities&lt;/li&gt;
&lt;li&gt;Slot values&lt;/li&gt;
&lt;li&gt;Previous actions&lt;/li&gt;
&lt;li&gt;Active rules or forms&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Multiple policies can exist at once, and RASA evaluates all of them before choosing the final action. Within the information-processing architecture, policies sit after the conversation state has been considered.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User message
   |
   V
NLU (intent + entities)
   |
   V
Tracker (conversation state)
   |
   V
Policies evaluate state
   |
   V
Best next action chosen
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Policies operate after NLU and before response execution.&lt;/p&gt;

&lt;p&gt;Each policy:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Looks at the conversation tracker&lt;/li&gt;
&lt;li&gt;Predicts the next action&lt;/li&gt;
&lt;li&gt;Assigns a confidence score&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;RASA then selects the action with the highest confidence across all policies. A typical config.yml might look like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;policies:
  - name: RulePolicy
  - name: MemoizationPolicy
  - name: TEDPolicy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
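&lt;p&gt;The "highest confidence wins" selection described above can be sketched as follows (the prediction list is invented for illustration):&lt;/p&gt;

```python
# Each policy proposes a next action with a confidence score;
# RASA-style selection keeps the highest-confidence proposal.
predictions = [
    {"policy": "RulePolicy", "action": "utter_goodbye", "confidence": 1.0},
    {"policy": "MemoizationPolicy", "action": "utter_goodbye", "confidence": 0.8},
    {"policy": "TEDPolicy", "action": "action_give_advice", "confidence": 0.6},
]

best = max(predictions, key=lambda p: p["confidence"])
print(best["policy"], best["action"])  # RulePolicy utter_goodbye
```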



&lt;p&gt;&lt;strong&gt;RulePolicy&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The RulePolicy enforces rules. It checks whether any rule applies; if yes, it executes the rule-defined action and overrides all other policies. This guarantees deterministic behavior: if a rule matches, no ML prediction is needed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MemoizationPolicy&lt;/strong&gt;&lt;br&gt;
Memoization is exact recall. If the current conversation state exactly matches a previously seen story, RASA repeats the same next action.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TEDPolicy&lt;/strong&gt;&lt;br&gt;
The TEDPolicy is RASA’s main ML-based dialogue policy. It embeds conversation states, learns patterns across stories, and generalises to unseen paths.&lt;/p&gt;

&lt;p&gt;TED allows the assistant to handle paraphrases, adapt to partial information, and manage complex, branching conversations.&lt;/p&gt;

&lt;p&gt;When RASA processes policies, it follows this conceptual order. In our example: RulePolicy first (deterministic behaviour), then MemoizationPolicy (exact recall of trained/seen data), and finally TEDPolicy to hand things off to ML-based prediction.&lt;/p&gt;
&lt;h2&gt;
  
  
  Forms
&lt;/h2&gt;

&lt;p&gt;If intents tell RASA what the user wants, entities extract key information, and policies decide what to do next, then forms exist to systematically collect missing information.&lt;br&gt;
Forms are RASA’s way of saying:&lt;br&gt;
&lt;code&gt;I can’t proceed until I have everything I need.&lt;/code&gt; &lt;/p&gt;

&lt;p&gt;A form is a controlled dialogue mechanism used to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ask the user for required information&lt;/li&gt;
&lt;li&gt;Validate inputs&lt;/li&gt;
&lt;li&gt;Store values in slots&lt;/li&gt;
&lt;li&gt;Maintain conversational context until completion&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They exist because free-flow conversation breaks down when:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Multiple values are required&lt;/li&gt;
&lt;li&gt;Order matters&lt;/li&gt;
&lt;li&gt;Missing data blocks progress&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In our previous pipeline, forms act right after intent and entity consideration.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User message
   |
   V
Intent + Entities
   |
   V
Form activated
   |
   V
Ask for required slots
   |
   V
Validate inputs
   |
   V
Form deactivates

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Defining a form&lt;/strong&gt;&lt;br&gt;
Forms are declared in domain.yml in the following manner:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;forms:
  symptom_form:
    required_slots:
      - symptom
      - duration
      - severity

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
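&lt;p&gt;The slot-filling loop a form performs can be sketched like this (hypothetical helper; RASA handles this internally once the form is declared):&lt;/p&gt;

```python
# Ask for the first required slot that is still empty;
# when none are empty, the form can deactivate.
REQUIRED_SLOTS = ["symptom", "duration", "severity"]

def next_slot_to_ask(slots):
    for name in REQUIRED_SLOTS:
        if slots.get(name) is None:
            return name
    return None  # all slots filled

slots = {"symptom": "fever", "duration": None, "severity": None}
print(next_slot_to_ask(slots))  # duration
```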



&lt;p&gt;When a required slot is empty and the form has been activated in stories or rules, RASA automatically asks for it.&lt;br&gt;
This covers the basics of using RASA to build a chatbot; from here, we will begin diving deeper into how the chatbot files play off each other, how policies themselves work, and intentional actions.&lt;/p&gt;

&lt;p&gt;The next blog: &lt;a href="https://dev.to/aniket_kuyate_15acc4e6587/understating-the-whitespace-tokenizers-2ic7"&gt;To be released&lt;/a&gt;&lt;/p&gt;

</description>
      <category>yaml</category>
      <category>llm</category>
      <category>rasa</category>
      <category>chatbot</category>
    </item>
    <item>
      <title>Understanding YAML</title>
      <dc:creator>Unknownerror-404</dc:creator>
      <pubDate>Sat, 03 Jan 2026 12:50:00 +0000</pubDate>
      <link>https://forem.com/aniket_kuyate_15acc4e6587/understanding-yaml-45ck</link>
      <guid>https://forem.com/aniket_kuyate_15acc4e6587/understanding-yaml-45ck</guid>
      <description>&lt;p&gt;Following up &lt;a href="https://dev.to/aniket_kuyate_15acc4e6587/understanding-rasa-1h1c"&gt;Understanding RASA&lt;/a&gt; which discussed Featurizers and Classifiers, and Pipelines, this one will dive into Stories, Rules and Policies.&lt;/p&gt;

&lt;p&gt;Contents of this blog:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;YAML&lt;/li&gt;
&lt;li&gt;Intents&lt;/li&gt;
&lt;li&gt;Entities&lt;/li&gt;
&lt;li&gt;Slots&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This blog will introduce readers to essential building blocks of yaml and RASA itself.&lt;/p&gt;

&lt;h2&gt;
  
  
  YAML
&lt;/h2&gt;

&lt;p&gt;YAML (or YML), despite the name, stands for "YAML Ain't Markup Language": unlike HTML or XML, it is not used for website design; when using RASA, it acts as the structural foundation. Before we move on, let's begin with understanding what YAML is and how it works.&lt;/p&gt;

&lt;p&gt;YAML is effectively a data-description language that uses indentation to form typed blocks. In YAML, a type can be considered a superclass that contains all the subtypes of that particular class.&lt;/p&gt;

&lt;p&gt;The most common structure of a block is as provided below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- Type:
    examples: |
      - e.g. 1
      - e.g. 2
      - e.g. 3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This structure is most commonly used for defining intents, slots or entities when providing examples. &lt;/p&gt;

&lt;h2&gt;
  
  
  Intents
&lt;/h2&gt;

&lt;p&gt;Intents are essentially what the user intended to say with their message. Now, even though RASA uses ML, that ML is applied only through the policies you set and create. (If you don't know what those are, head on over to &lt;a href="https://dev.to/aniket_kuyate_15acc4e6587/understanding-rasa-1h1c"&gt;understanding RASA&lt;/a&gt;, where it is clearly established in the first blog.)&lt;/p&gt;

&lt;p&gt;It is an effective mapping tool which links all the similar meaning to a singular intent which conveys the broader response pattern.&lt;/p&gt;

&lt;p&gt;E.g:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;I think I have a stomachache,
My stomach hurts,
I might be having abdominal pain.

Are all linked to the intent: intent_symptom_stomach_ache
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So an intent answers:&lt;br&gt;
What is the user trying to do or express?&lt;/p&gt;

&lt;p&gt;Within RASA itself, intents are the core unit of NLU (Natural Language Understanding). When considering a pipeline, we operate on text as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;                         [Some input query]
                                  |
                                  V
          [RASA model predictions from intents and entities]
                                  |
                                  V
           [Dialogue manager manages which O/P to provide]
                                  |
                                  V
                           [Bot response]



*Pipeline representation using ASCII art.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Internally, these queries are represented in JSON like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "intent": {
    "name": "report_symptom",
    "confidence": 0.92
  },
  "entities": [
    {"entity": "symptom", "value": "cough"},
    {"entity": "duration", "value": "two days"}
  ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here the intent tag defines which type of query the user provided, and the confidence value shows how confident the module is in this prediction.&lt;/p&gt;

&lt;p&gt;In actuality, when building chatbots, we would define these as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;version: "3.1"

nlu:
- intent: greet
  examples: |
    - hello
    - hi
    - good morning

- intent: report_symptom
  examples: |
    - I have a headache
    - My head hurts
    - I've been coughing for two days
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, the intent is the group label under which the examples belong. Intents are then registered in a file called domain.yml, which acts as the initialisation of each intent/group for user queries.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;intents:
  - greet
  - goodbye
  - affirm
  - deny
  - mood_great
  - mood_unhappy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h2&gt;
  
  
  Entity
&lt;/h2&gt;

&lt;p&gt;While the intent captures the intention behind a sentence, an entity captures the specific value the user provided. Without the specificity of entities, the information cannot be used for in-depth responses. Entities enable dynamic behaviour by allowing the conversation to branch on their values.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Intent -&amp;gt; Symptom
Entities -&amp;gt; 
  symptomp = fever.
  duration = No. of days.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without these entities, the bot loses much of its apparent intelligence. In RASA, entities are extracted by the NLU and passed to the dialogue manager. JSON represents them in a similar manner to intents.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"entities": [
    {"entity": "symptom", "value": "fever"},
    {"entity": "duration", "value": "three days"}
  ]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Entity definition occurs within the same file where we define the examples of intents i.e. 'nlu.yml'. However, the actual initialisation occurs within domain.yml in a similar manner.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- intent: report_symptom
  examples: |
    - I have a [fever](symptom)
    - I've been coughing for [two days](duration)
    - My [head](body_part) hurts


*within nlu.yml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
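&lt;p&gt;For completeness, the corresponding initialisation in domain.yml would look roughly like this (a minimal sketch; the entity names follow the examples above):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;entities:
  - symptom
  - duration
  - body_part


*within domain.yml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;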



&lt;p&gt;Entities come in multiple types: plain word entities, categorical, numerical, and lookup-table or regex entities. They are often used together to capture as much information as possible.&lt;/p&gt;

&lt;h2&gt;
  
  
  Slots
&lt;/h2&gt;

&lt;p&gt;If intents answer what the user wants, and entities answer which specific information they provided, then slots answer what the assistant remembers.&lt;/p&gt;

&lt;p&gt;In simple terms, slots act as RASA’s memory system.&lt;/p&gt;

&lt;p&gt;While entities are extracted from a single user message, slots persist across multiple turns of conversation. This allows the chatbot to reason contextually instead of treating every user message as an isolated input.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why slots are needed&lt;/strong&gt;&lt;br&gt;
Consider the following interaction:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User: I have a fever.
Bot: How long have you had it?
User: Three days.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, the bot must remember that:&lt;br&gt;
“it” refers to fever&lt;br&gt;
the symptom has already been mentioned&lt;/p&gt;

&lt;p&gt;This continuity is made possible by slots. Without slots, the dialogue manager would not retain previous information, and the conversation would feel repetitive or incoherent.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Slot representation:&lt;/strong&gt;&lt;br&gt;
Extending the earlier pipeline representation, slots would act accordingly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;                         [User input]
                               |
                               V
                [Intent &amp;amp; Entity extraction]
                               |
                               V
                 [Slot filling / slot update]
                               |
                               V
               [Dialogue manager (policies)]
                               |
                               V
                        [Bot response]


*extension of the pipeline from intents.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Slots sit between NLU and dialogue management, acting as state variables that influence which action or response is selected next.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Defining slots&lt;/strong&gt;&lt;br&gt;
Slots are initialised inside domain.yml, similar to intents and entities.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;slots:
  symptom:
    type: text
    influence_conversation: true
  duration:
    type: text
    influence_conversation: true

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here:&lt;br&gt;
The type defines how the data is stored, and influence_conversation determines whether the slot's value influences which action is predicted next.&lt;/p&gt;

&lt;p&gt;When an entity is extracted, RASA can automatically map it to a corresponding slot. &lt;/p&gt;
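&lt;p&gt;In recent RASA versions (3.x), this mapping is declared explicitly on the slot itself; a minimal sketch, reusing the symptom entity from earlier:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;slots:
  symptom:
    type: text
    influence_conversation: true
    mappings:
      - type: from_entity
        entity: symptom


*within domain.yml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;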

&lt;p&gt;&lt;strong&gt;Slots in JSON representation&lt;/strong&gt;&lt;br&gt;
Once filled, slots are stored internally as part of the conversation state.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"slots": {
  "symptom": "fever",
  "duration": "three days"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Slots can store different kinds of information depending on the use case.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Text slots: store raw strings (e.g., symptoms, names)&lt;/li&gt;
&lt;li&gt;Categorical slots: restrict values to a predefined set (e.g., mild / moderate / severe)&lt;/li&gt;
&lt;li&gt;Boolean slots: true/false flags (e.g., emergency_present)&lt;/li&gt;
&lt;li&gt;Float / integer slots: numerical values such as age or dosage&lt;/li&gt;
&lt;li&gt;List slots: store multiple values (e.g., multiple symptoms)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are the most common types.&lt;/p&gt;
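&lt;p&gt;As a sketch, these slot types are declared in domain.yml as follows (the slot names are illustrative, and RASA 3.x additionally requires a mappings key per slot, omitted here for brevity):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;slots:
  severity:
    type: categorical
    values:
      - mild
      - moderate
      - severe
  emergency_present:
    type: bool
  symptoms:
    type: list


*within domain.yml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;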

&lt;p&gt;Now that the most basic definitions have been established, we'll look into how this behaviour is handled by the predefined pipeline.&lt;/p&gt;

&lt;p&gt;The next blog: &lt;a href="https://dev.tourl"&gt;To be released&lt;/a&gt;&lt;/p&gt;

</description>
      <category>rasa</category>
      <category>yaml</category>
      <category>chatbot</category>
      <category>llm</category>
    </item>
    <item>
      <title>Understanding RASA</title>
      <dc:creator>Unknownerror-404</dc:creator>
      <pubDate>Thu, 01 Jan 2026 12:00:00 +0000</pubDate>
      <link>https://forem.com/aniket_kuyate_15acc4e6587/understanding-rasa-1h1c</link>
      <guid>https://forem.com/aniket_kuyate_15acc4e6587/understanding-rasa-1h1c</guid>
      <description>&lt;p&gt;Previously, we understood the basics of Natural Language Processing ranging from sentence segmentation to parsing. These essential fundamentals form the foundation for understanding how systems work with and manipulate sentences. If you haven't read the blog, you can read it &lt;a href="https://dev.to/aniket_kuyate_15acc4e6587/the-predecessors-of-llms-understanding-chatbots-365i"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Moving forward we'll dive into understanding chatbots and building them using RASA.&lt;/p&gt;

&lt;p&gt;Contents of this blog:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Chatbots and development&lt;/li&gt;
&lt;li&gt;What is RASA?&lt;/li&gt;
&lt;li&gt;RASA core.&lt;/li&gt;
&lt;li&gt;Featurizers and Classifiers.&lt;/li&gt;
&lt;li&gt;Pipelines.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Definitions&lt;/strong&gt;&lt;br&gt;
Intents: An intent is a specific grouping of messages which the module can anticipate being used to map responses to.&lt;/p&gt;

&lt;p&gt;Classifiers: Classifiers essentially take features produced by featurizers and make predictions.&lt;/p&gt;

&lt;p&gt;Entity: An entity is a piece of information that the chatbot extracts from the user to perform some action.&lt;/p&gt;

&lt;p&gt;Slots: Slots are temporary variables used to hold data from conversations.&lt;/p&gt;
&lt;h2&gt;
  
  
  Chatbot and development
&lt;/h2&gt;

&lt;p&gt;Chatbots are basic response systems used for providing answers to queries. Although they are meant to be consistent, we can add dynamism by considering the methods by which they are developed.&lt;/p&gt;

&lt;p&gt;Effectively, chatbot development consisted of linking predefined answers to questions based on user needs. Most commonly these were used as assistants within web services, though their utility doesn't end there. With the introduction of modern AI, dedicated chatbot development has become scarcer; as people prefer building LLMs with neural networks, interest in rule-based chat models has dwindled.&lt;/p&gt;

&lt;p&gt;However, this doesn't make them ancient tech; rather, a better understanding of preset response systems can help newer developers grasp the foundations of LLM development in general.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;So how does one develop chatbots?&lt;/strong&gt;&lt;br&gt;
The development of chatbots varies a lot: for some, a simple if-else structure leading to different website responses counts as a chatbot; for others, a chatbot should be dynamic enough to be syntactically intelligent while remaining consistent in its answers.&lt;/p&gt;

&lt;p&gt;Chatbot development can be divided into static type and dynamic type (very broadly) based on this user need.&lt;/p&gt;

&lt;p&gt;As previously stated, a simple if-else clause handling a 'y/n' response can be considered a rudimentary chatbot. In this case the answers are preloaded in the form of links or redirects to relevant pages.&lt;/p&gt;

&lt;p&gt;In recent years the web-dev scene has moved closer to adopting syntactical analysers, adding a sense of dynamism while keeping consistency through such logical iterators. These form the basis of website helpers or assistants: nearly intelligent, yet partly logical.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Query: Hey, Mind redirecting me to the Home page? I can't seem to find the link.
Assistant (internal): Hey, Mind redirecting me to the Home page? (Question)
I can't seem to find the link. (sentence)

Assistant (syntactical handling): ['Redirect'(action), 'me'(user), 'Home page'(location), 'can't find link' (reason)]

System response: https://Link_for_loc.org (some worded response)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This way the developer has complete control over the prompt responses, which offers flexibility when developing large scalable websites. It reduces the 'black box' of neural network training and adds more transparency to the system itself.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is RASA?
&lt;/h2&gt;

&lt;p&gt;Now that we've established the need for response control with syntactical analysis one might question what RASA even is.&lt;/p&gt;

&lt;p&gt;It's exactly that: RASA is a Python framework that provides syntactical intelligence to systems. It enables an ML-based input system capable of understanding synonyms as well as full sentence variation.&lt;/p&gt;

&lt;p&gt;For this, the RASA module utilizes two core sub-modules which are responsible for this ability.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;RASA core&lt;/li&gt;
&lt;li&gt;RASA NLU &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For this blog we'll set our focus on what RASA core is and does, specifically looking at featurizers, classifiers, pipelines, and policies.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is RASA core?
&lt;/h2&gt;

&lt;p&gt;RASA core can be considered as the responder to the user queries. The queries are commonly handled by the NLU (Natural Language Understanding) engine, whereas the responses are managed by the core.&lt;/p&gt;

&lt;p&gt;RASA core is a state machine: it keeps track of the conversation, works out what the user intends from a sentence, and finds the appropriate response for the query. While we'll cover NLU in an upcoming blog, I'll briefly explain how it works here, as an understanding of intents is crucial for understanding response generation.&lt;/p&gt;

&lt;p&gt;Simply stated, RASA NLU utilizes a file known as 'domain.yml'. It is a YAML file used to declare all the types of intents: under the 'intents' header, all the relevant group titles are listed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;intents:
  - greet
  - goodbye
  - affirm
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This essentially tells the model that there will be some queries which you may expect of the type 'greet' or 'goodbye'.&lt;/p&gt;

&lt;p&gt;The formatting here comes from YAML itself, which uses Python-like indentation followed by dashes to declare list items.&lt;/p&gt;

&lt;p&gt;These intents are then "initialised", similarly to how variable initialisation works: under the file named 'nlu.yml' we declare all the examples of what 'greet' may look like (if this is difficult to picture, imagine a super class called greet which holds all its methods: examples of greetings).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- intent: greet
  examples: |
    - hey
    - hello
    - hi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once all of these examples have been added, we map these groups to their respective responses within a rules file, which lists the steps to be performed once a matching query is asked.&lt;/p&gt;
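&lt;p&gt;A minimal sketch of such a rule, assuming a response named utter_greet has been defined under the responses section of domain.yml:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;rules:
- rule: Respond to a greeting
  steps:
  - intent: greet
  - action: utter_greet


*within rules.yml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;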

&lt;p&gt;As we understand how the model understands and handles data inputs let's move onto the actual reason behind why it is able to understand varying degrees of similar sentence intentions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Featurizers and Classifiers
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Featurizers&lt;/strong&gt;: Featurizers convert user messages (text) into numerical representations (features) that machine-learning models can understand.&lt;/p&gt;

&lt;p&gt;Effectively, they take the parsed input and produce a vector representation internally. This allows the model to recognise patterns, capture meaning, and obtain a sense of 'What do you mean?'&lt;/p&gt;

&lt;p&gt;There are multiple featurizers; which ones you use depends on your requirements.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Whitespace tokenizer: strictly a tokenizer rather than a featurizer, but it sits at the start of the pipeline. It forms tokens by treating every space between two words as the boundary where a new token begins.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;within: stories.yml
language: en
pipeline:
- name: WhitespaceTokenizer
  intent_tokenization_flag: true
  intent_split_symbol: "_"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Regex Featurizer: In Rasa, the RegexFeaturizer is a lightweight feature extractor that adds binary features based on whether parts of a user message match predefined regular expressions. It does not extract entities by itself, and it does not classify intents. Instead, it helps classifiers (like the DIETClassifier) by giving them strong additional signals.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;within: stories.yml
- name: RegexFeaturizer

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Lexical Syntactic Featurizer: In Rasa, the Lexical Syntactic Featurizer (officially LexicalSyntacticFeaturizer) is a token-level featurizer that adds linguistic pattern features based on the form and position of each token in a sentence. It helps classifiers and entity extractors recognize structural patterns, not meaning.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;within: stories.yml
- name: LexicalSyntacticFeaturizer

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And many more!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Classifiers&lt;/strong&gt;: &lt;br&gt;
In Rasa, classifiers are machine-learning components that take the numeric features produced by featurizers and use them to predict labels from user messages.&lt;/p&gt;

&lt;p&gt;Those labels are mainly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Intents (what the user wants)&lt;/li&gt;
&lt;li&gt;Entities (important structured values in the text)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What a classifier does&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Receives features (sparse + dense) from featurizers&lt;/li&gt;
&lt;li&gt;Learns patterns from labeled training data&lt;/li&gt;
&lt;li&gt;Predicts labels for new user messages&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In short:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Text → Features → Classifier → Intent / Entities
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Some common components are the DIETClassifier and the EntitySynonymMapper. Together they classify intents and extract and normalise entities from user queries.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;DIETClassifiers&lt;/strong&gt;&lt;br&gt;
DIET stands for Dual Intent and Entity Transformer. It performs both intent classification and entity extraction within a single model, in place of using two separate models.&lt;/p&gt;

&lt;p&gt;Featurizers such as the RegexFeaturizer (patterns), LexicalSyntacticFeaturizer (wording), and CountVectorsFeaturizer (vectorization) preprocess the initial input. DIET receives these features as input and, like transformer models such as BERT, uses self-attention, which allows it to learn contextual meaning.&lt;/p&gt;
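&lt;p&gt;Putting these together, a typical pipeline configuration with DIET would look roughly like this (a sketch; the epoch count is illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pipeline:
- name: WhitespaceTokenizer
- name: RegexFeaturizer
- name: LexicalSyntacticFeaturizer
- name: CountVectorsFeaturizer
- name: DIETClassifier
  epochs: 100
- name: EntitySynonymMapper


*within config.yml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;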

&lt;p&gt;&lt;strong&gt;Entity Synonym Mapper&lt;/strong&gt;&lt;br&gt;
Used for normalising entity values: it maps synonymous phrasings from varying user inputs onto a single canonical value. This effectively translates:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;I need a heart doctor -&amp;gt; I need a cardiologist.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These variations are spotted by the synonym mapper; the resulting canonical values are then held in slots, which are mapped within the 'domain.yml' file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;slots:
  patient_name:
    type: text
    mappings:
      - type: from_entity
        entity: patient_name
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
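&lt;p&gt;The synonyms themselves are typically declared alongside the training data; a minimal sketch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- synonym: cardiologist
  examples: |
    - heart doctor
    - heart specialist


*within nlu.yml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;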



&lt;p&gt;&lt;strong&gt;Pipelines&lt;/strong&gt;&lt;br&gt;
Pipelines are the architectural ordering used to process sentences. They are largely task-specific, so RASA itself provides prebuilt pipeline configurations. Since these are open source, developers are encouraged to compose their own sequencing for their task, which is essential when flexibility is preferred.&lt;/p&gt;

&lt;p&gt;For e.g.&lt;br&gt;
SpaCy Pipeline, Bert Pipeline or Bio-Bert Pipelines.&lt;/p&gt;

&lt;p&gt;SpaCy Pipeline:&lt;br&gt;
SpaCy is an NLP library with pre-trained embeddings for multiple languages. It provides word embeddings along with POS tagging, lemmatization, and some NER.&lt;/p&gt;
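&lt;p&gt;A SpaCy-based pipeline would be configured roughly as follows (a sketch; the model name is illustrative and must be installed separately):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;language: en
pipeline:
- name: SpacyNLP
  model: en_core_web_md
- name: SpacyTokenizer
- name: SpacyFeaturizer
- name: DIETClassifier
  epochs: 100


*within config.yml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;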


&lt;p&gt;Bert Pipeline: provides deep contextual embeddings, i.e. it understands meaning from context. For RASA, Hugging Face transformer components or DIET are integrated into the pipeline.&lt;/p&gt;


&lt;p&gt;Bio-BERT: a domain-specific version of BERT pretrained on biomedical text, making it better suited for medical terminology. It offers more accurate NER and is useful for symptom checkers based on disease and drug names, as well as appointment scheduling for a specified specialist.&lt;/p&gt;

&lt;p&gt;These modules basically form the backbone of response generation and information outputting. In the following blogs I'll dive deeper into the rules, policies and stories which will inform you how the rules for what to output are formed.&lt;/p&gt;

&lt;p&gt;Until next time!&lt;br&gt;
The next blog: &lt;a href="https://dev.tourl"&gt;To be decided&lt;/a&gt;&lt;/p&gt;

</description>
      <category>rasa</category>
      <category>yaml</category>
      <category>chatbot</category>
      <category>llm</category>
    </item>
    <item>
      <title>The predecessors of LLM's: Understanding Chatbots</title>
      <dc:creator>Unknownerror-404</dc:creator>
      <pubDate>Tue, 30 Dec 2025 12:00:00 +0000</pubDate>
      <link>https://forem.com/aniket_kuyate_15acc4e6587/the-predecessors-of-llms-understanding-chatbots-365i</link>
      <guid>https://forem.com/aniket_kuyate_15acc4e6587/the-predecessors-of-llms-understanding-chatbots-365i</guid>
      <description>&lt;p&gt;Contents of this blog:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sentence segmentation&lt;/li&gt;
&lt;li&gt;Tokenization&lt;/li&gt;
&lt;li&gt;POS tagging&lt;/li&gt;
&lt;li&gt;Parsing&lt;/li&gt;
&lt;li&gt;Named Entity Recognition&lt;/li&gt;
&lt;li&gt;Relation extraction&lt;/li&gt;
&lt;li&gt;Conversational Chatbots using RASA&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Natural Language Processing:
&lt;/h2&gt;

&lt;p&gt;For those unfamiliar with it, Natural Language Processing (NLP) can be described as the application of computational linguistics within computer science. While this definition captures the theory, its practical meaning is best understood through application.&lt;/p&gt;

&lt;p&gt;In practice, NLP involves building systems that can process and work with human language, ranging from analyzing sentence structure to generating appropriate responses based on that analysis, as seen in modern large language models (LLMs).&lt;/p&gt;

&lt;p&gt;However, generating meaningful responses requires a clear understanding of several foundational concepts, some theory, and a significant amount of practical experimentation.&lt;/p&gt;

&lt;p&gt;In this series, I aim to explore the process of building small-scale pretrained chatbots, beginning with rule and intent-based systems using RASA and YAML, and gradually progressing toward small-scale LLMs. So, let’s begin with the basics…&lt;/p&gt;

&lt;h2&gt;
  
  
  Sentence segmentation
&lt;/h2&gt;

&lt;p&gt;Sentence segmentation is the most essential and one of the earliest processing steps. It is used to track the start and end of each sentence within a given paragraph.&lt;br&gt;
For e.g.:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;It was nearly midnight. The Doctor was on his way out.
It was nearly midnight -&amp;gt; Sentence 1
the Doctor was on his way out. -&amp;gt; Sentence 2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Accurate sentence segmentation is critical, as errors at this stage can propagate to downstream tasks such as parsing, named entity recognition, and information extraction.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tokenization
&lt;/h2&gt;

&lt;p&gt;Tokenization is the process of dividing each sentence found by segmentation into smaller units called "tokens". Essentially, the sentence is divided into meaningful tokens that hold its essential structural information.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;The doctor reviewed the patient’s chart.
Tokens: ["The", "doctor", "reviewed", "the", "patient", "’s", "chart", "."]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Tokenization helps the model in considering what a word stands for in a given structure. Inaccurate tokenization can further propagate downwards leading to illogical wording patterns.&lt;/p&gt;

&lt;p&gt;Tokenization can be further sub-divided into three categories based on the requirement. Namely tokens can be formed on the basis of word extraction, Sub-word Tokenization, or Character Tokenization. Let's briefly understand them as some of these topics are currently used in modern NLP.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Word Extraction&lt;/strong&gt;: Word Extraction Tokenization works as explained above, essentially extracting tokens to form words.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sub-word Tokenization&lt;/strong&gt;: Sub-word tokenization breaks words into smaller units to better handle rare, ambiguous, or previously unseen terms. Instead of relying on a fixed vocabulary of complete words, sub-word tokenizers decompose words into frequently occurring character sequences learned from training data.&lt;/p&gt;

&lt;p&gt;This approach allows lightweight or vocabulary-limited models to generalize effectively without treating unfamiliar words as entirely unknown.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;E.g.: 'Antibiotics' -&amp;gt; [Anti, Biotics] or [Anti, Bio, Tics]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These sub-word units are derived from statistical patterns rather than semantic meaning, and the exact split depends on the tokenization algorithm used (such as BPE or WordPiece).&lt;/p&gt;

&lt;p&gt;While sub-word tokenization is primarily an NLP technique, it can indirectly support text-to-speech (TTS) systems in integrated pipelines by enabling consistent handling of rare or complex words before phoneme or pronunciation modeling occurs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Character level tokenization&lt;/strong&gt;: Character-level tokenization is a more fine-grained approach in which text is decomposed into individual characters rather than words or sub-words.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;E.g.: 'Antibiotics' -&amp;gt; ['A', 'n', 't', 'i', 'b', 'i', 'o', 't', 'i', 'c', 's'].
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This method is useful for handling noisy input, spelling variations, and highly specialized terminology, though it often increases sequence length and computational cost. Character-level tokenization is typically used in niche applications or combined with higher-level tokenization strategies.&lt;/p&gt;

&lt;h2&gt;
  
  
  POS tagging
&lt;/h2&gt;

&lt;p&gt;POS tagging stands for Part-of-Speech tagging; during this process, each token is labeled with its own linguistic part of speech.&lt;br&gt;
Just like in high school grammar, POS tagging simply states:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Doctor -&amp;gt; Noun
screeched -&amp;gt; Verb
! -&amp;gt; Punctuation (Punct internally)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Parsing
&lt;/h2&gt;

&lt;p&gt;Although parsing is no longer a central architectural component in modern LLM-based chatbots, it remains a foundational concept that historically informed how linguistic structure is modeled in NLP systems. Parsing focuses on identifying how words within a sentence relate to one another through grammatical roles and dependencies.&lt;/p&gt;

&lt;p&gt;At its core, parsing assigns syntactic roles to words, allowing a sentence to be represented in a structured form. For e.g.:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Doctor -&amp;gt; Subject
treated -&amp;gt; Verb (POS determined)
the -&amp;gt; determiner
dog -&amp;gt; Object
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The aim of parsing is to perform a form of syntactical analysis: it relates each word within the sentence to the others and assigns it a relation type, to be considered further during Entity Recognition.&lt;/p&gt;

&lt;h2&gt;
  
  
  Named Entity Recognition
&lt;/h2&gt;

&lt;p&gt;The Entity Recognition Module (ERM), also known as Named Entity Recognition (NER), is the process of identifying named entities from the list provided after parsing. It is the most task-specific module in the application: it can be swapped out based on the task at hand. Depending on the NER module used, we obtain results such as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Dr. XYZ -&amp;gt; Doctor
amoxicillin -&amp;gt; Medicine
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It is necessary for contextualization of tokens, entity detection, and entity classification. Based on the requirements the module can be rule based, ML-based or Neural NERs. Each one is used effectively for simple, learned and complex applications respectively.&lt;/p&gt;

&lt;h2&gt;
  
  
  Relation extraction
&lt;/h2&gt;

&lt;p&gt;Relation Extraction (RE) is an NLP process that identifies and classifies meaningful relationships between entities detected in text. While Named Entity Recognition (NER) answers “what entities are present?”, relation extraction answers “how are these entities connected?”&lt;br&gt;
Relation extraction operates on text where entities have already been identified and determines the semantic relationship between them.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Dr. XYZ prescribed amoxicillin to patient.
(amoxicillin given_to patient)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This way only the most important relationships are considered and mapped by the extractor.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conversational Chatbots using RASA
&lt;/h2&gt;

&lt;p&gt;So, what does this information imply for chatbots? Effectively, nothing directly. It does, however, give you a clear understanding of how computers handle sentences when trying to understand them. When working with RASA, we will not be implementing any of this ourselves, but we still work with a few of these concepts, such as intents, entities, and relations.&lt;/p&gt;

&lt;p&gt;In the blogs following this one, we'll dive deeper into how RASA develops chatbots, beginning with similar basics working all the way up to hopefully a working chatbot which you can converse with.&lt;/p&gt;

&lt;p&gt;So, until next time!&lt;br&gt;
The next blog: &lt;a href="https://dev.to/aniket_kuyate_15acc4e6587/understanding-rasa-1h1c"&gt;Understanding RASA&lt;/a&gt;&lt;/p&gt;

</description>
      <category>rasa</category>
      <category>llm</category>
      <category>ai</category>
      <category>chatbot</category>
    </item>
  </channel>
</rss>
