<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Mihir Joshi</title>
    <description>The latest articles on Forem by Mihir Joshi (@mihirj).</description>
    <link>https://forem.com/mihirj</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3065302%2Fd8a1a300-9bd7-4d51-9bd2-8dc5b755cdc6.jpg</url>
      <title>Forem: Mihir Joshi</title>
      <link>https://forem.com/mihirj</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/mihirj"/>
    <language>en</language>
    <item>
      <title>Building a Smart Café Menu Ordering Agent ☕🤖: Natural Language to Structured JSON with RAG</title>
      <dc:creator>Mihir Joshi</dc:creator>
      <pubDate>Sat, 19 Apr 2025 17:48:29 +0000</pubDate>
      <link>https://forem.com/mihirj/building-a-smart-cafe-menu-ordering-agent-natural-language-to-structured-json-with-rag-287a</link>
      <guid>https://forem.com/mihirj/building-a-smart-cafe-menu-ordering-agent-natural-language-to-structured-json-with-rag-287a</guid>
      <description>&lt;h2&gt;
  
  
  A Technical Deep Dive into Creating an Intelligent Interface
&lt;/h2&gt;

&lt;p&gt;Imagine walking into your favorite café and simply speaking your order: "Can I get a large oat milk latte and one blueberry muffin?" For a human barista, this is easy. But for an automated system, understanding this natural-language request and translating it into a precise, machine-readable order (like a JSON object 🧾) is a complex technical challenge.&lt;/p&gt;

&lt;p&gt;This post dives into how we can build a Smart Café Menu Ordering Agent using modern GenAI techniques, specifically focusing on the Retrieval Augmented Generation (RAG) pattern. We'll explore the technical architecture behind converting free-text customer queries into structured output, making automated order processing a reality. This project, explored in a Kaggle notebook, integrates several key ML and GenAI components to create an intelligent interface.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Agent's Technical Workflow: From Query to JSON Order
&lt;/h2&gt;

&lt;p&gt;Let's break down the technical components that empower our Smart Café Agent, following the journey of a customer's natural language order from input to a structured JSON output.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Data Ingestion and Preparation 📊
&lt;/h3&gt;

&lt;p&gt;The agent needs to know the menu! The process starts with loading the menu data using &lt;strong&gt;pandas&lt;/strong&gt;. A crucial preprocessing step prepares the text for understanding: concatenating the item name, description, and category into a single text column. This consolidated string provides a rich textual representation for each menu item, essential for the next step.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Library:&lt;/strong&gt; pandas&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Key Operation:&lt;/strong&gt; Data loading (&lt;code&gt;pd.read_csv&lt;/code&gt;), Feature Engineering (string concatenation for text column).
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;menu_df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;YOUR_CSV_FILE_PATH&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;menu_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;menu_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;item_name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; - &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;menu_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; - category: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;menu_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;category&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This snippet shows the initial data loading into a pandas DataFrame and the creation of the combined text column used for embedding.&lt;/p&gt;
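&lt;p&gt;The concatenation step can be exercised end-to-end with a tiny in-memory DataFrame (the column names follow the article; the rows and the &lt;code&gt;fillna&lt;/code&gt; guard for missing descriptions are illustrative additions, not part of the original notebook):&lt;/p&gt;

```python
import pandas as pd

# Toy stand-in for the real menu CSV (column names assumed from the article)
menu_df = pd.DataFrame({
    "item_name": ["Latte", "Blueberry Muffin"],
    "description": ["Espresso with steamed milk", None],
    "category": ["Drinks", "Bakery"],
})

# fillna guards against missing descriptions, which would otherwise
# turn the whole concatenated string into NaN for that row
menu_df["text"] = (
    menu_df["item_name"]
    + " - " + menu_df["description"].fillna("")
    + " - category: " + menu_df["category"]
)

print(menu_df["text"].tolist())
```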

&lt;h3&gt;
  
  
  2. Semantic Understanding with Embeddings ✨
&lt;/h3&gt;

&lt;p&gt;To enable our agent to understand the customer's request and semantically match it to our menu, we convert all text into numerical vectors using &lt;strong&gt;Embeddings&lt;/strong&gt;. A pre-trained SentenceTransformer model, all-MiniLM-L6-v2 (chosen for efficiency in environments like Kaggle), generates these dense vectors for both the menu items and the customer's query. Items or queries with similar meanings will have vectors close to each other in a high-dimensional space.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Capability:&lt;/strong&gt; Embeddings&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Library:&lt;/strong&gt; sentence-transformers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Model:&lt;/strong&gt; all-MiniLM-L6-v2&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Key Operation:&lt;/strong&gt; Encoding text into vectors (&lt;code&gt;embedder.encode&lt;/code&gt;).
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sentence_transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SentenceTransformer&lt;/span&gt;

&lt;span class="c1"&gt;# Load embedding model (use small one for Kaggle)
&lt;/span&gt;
&lt;span class="n"&gt;embedder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SentenceTransformer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;all-MiniLM-L6-v2&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Generate embeddings
&lt;/span&gt;
&lt;span class="n"&gt;menu_embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embedder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;menu_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;tolist&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;show_progress_bar&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Assuming user_query is defined
&lt;/span&gt;
&lt;span class="c1"&gt;# query_embedding = embedder.encode([user_query])
&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This code demonstrates loading the specific embedding model and applying it to the menu text data to create searchable vectors.&lt;/p&gt;
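&lt;p&gt;If sentence-transformers is not installed, the mechanics of the encode step can still be sketched with a dependency-free toy bag-of-words encoder. The vocabulary and texts below are invented for illustration; the real all-MiniLM-L6-v2 model instead returns dense 384-dimensional vectors:&lt;/p&gt;

```python
import numpy as np

# Toy vocabulary standing in for the model's learned representation space
VOCAB = ["latte", "muffin", "espresso", "blueberry", "oat", "milk"]

def toy_encode(texts):
    # Mimics embedder.encode: a list of strings in, a 2-D array out,
    # one row per input text
    vecs = np.zeros((len(texts), len(VOCAB)))
    for i, text in enumerate(texts):
        tokens = text.lower().split()
        for j, word in enumerate(VOCAB):
            vecs[i, j] = tokens.count(word)
    return vecs

menu_vecs = toy_encode(["oat milk latte", "blueberry muffin"])
query_vec = toy_encode(["one blueberry muffin please"])
print(menu_vecs.shape)  # one 6-dimensional row per menu item
```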

&lt;h3&gt;
  
  
  3. Finding Relevant Items with Vector Search 🔍
&lt;/h3&gt;

&lt;p&gt;When a customer makes a request, the agent performs a &lt;strong&gt;Vector Search&lt;/strong&gt; to find the most relevant menu items. The embedding of the customer's query is compared to the embeddings of all menu items using cosine similarity. The items with the highest similarity scores are retrieved as potential matches. A simple in-memory index stores the menu embeddings for quick lookup.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Capability:&lt;/strong&gt; Vector Search&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Libraries:&lt;/strong&gt; numpy, sklearn&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Key Operation:&lt;/strong&gt; Cosine Similarity Calculation (&lt;code&gt;cosine_similarity&lt;/code&gt;), Index Sorting (&lt;code&gt;np.argsort&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Architectural Note:&lt;/strong&gt;  For production, this in-memory index would be replaced by a dedicated vector database for scalability and performance.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.metrics.pairwise&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;cosine_similarity&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;

&lt;span class="c1"&gt;# Assuming query_vector and menu_vectors are prepared (query_vector from user query embedding, menu_vectors from menu_embeddings)
&lt;/span&gt;
&lt;span class="n"&gt;similarities&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;cosine_similarity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query_vector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;menu_vectors&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Get index of the most similar item
&lt;/span&gt;
&lt;span class="n"&gt;top_match_index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;argmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;similarities&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Optional: Get indices of top N matches
&lt;/span&gt;
&lt;span class="n"&gt;top_n&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="n"&gt;top_n_indices&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;argsort&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;similarities&lt;/span&gt;&lt;span class="p"&gt;)[::&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][:&lt;/span&gt;&lt;span class="n"&gt;top_n&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This snippet shows the core calculation of cosine similarity between the query and menu embeddings to find the most semantically similar items.&lt;/p&gt;
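&lt;p&gt;A self-contained sketch of the retrieval step, with hand-made toy vectors in place of real embeddings. Note the &lt;code&gt;[0]&lt;/code&gt; indexing: &lt;code&gt;cosine_similarity&lt;/code&gt; returns a 2-D array of shape (1, n) for a single query, so the row must be extracted before sorting:&lt;/p&gt;

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Toy 3-dimensional vectors standing in for real sentence embeddings
menu_vectors = np.array([
    [1.0, 0.0, 0.2],   # e.g. "latte"
    [0.0, 1.0, 0.1],   # e.g. "blueberry muffin"
    [0.9, 0.1, 0.3],   # e.g. "cappuccino"
])
query_vector = np.array([[1.0, 0.05, 0.25]])  # query closest to "latte"

# cosine_similarity returns shape (1, n); take row 0 for the single query
similarities = cosine_similarity(query_vector, menu_vectors)[0]

top_n = 2
top_n_indices = np.argsort(similarities)[::-1][:top_n]
print(top_n_indices)  # indices of the two most similar menu items
```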

&lt;h3&gt;
  
  
  4. Contextual Interpretation with RAG ✨🧠
&lt;/h3&gt;

&lt;p&gt;Here's where &lt;strong&gt;Retrieval Augmented Generation (RAG)&lt;/strong&gt; comes into play, providing Grounding for the LLM. Instead of letting the language model guess based on its general training data, we give it the specific details of the top-N relevant menu items found in the vector search. The agent constructs a prompt that includes the customer's original query and the formatted details of these retrieved items. This guides the LLM to interpret the request accurately based on the actual menu options.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Capability:&lt;/strong&gt; Retrieval Augmented Generation (RAG), Grounding&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Key Operation:&lt;/strong&gt; Prompt Construction (incorporating retrieved text).
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="c1"&gt;# Extract and format retrieved items to include in prompt
&lt;/span&gt;
&lt;span class="c1"&gt;# retrieved_items comes from selecting rows from menu_df based on top_n_indices
&lt;/span&gt;
&lt;span class="n"&gt;retrieved_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;- &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;item_name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; (Price: \$&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;price&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;retrieved_items&lt;/span&gt;
&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="c1"&gt;# Final prompt structure sent to the LLM
&lt;/span&gt;
&lt;span class="n"&gt;rag_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
You are an AI café assistant. A customer asked: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;user_query&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;

Here are some menu items that may be relevant:
&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;retrieved_text&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;

Based on this, generate a structured JSON order suggestion with fields:

- item
- quantity
- modifiers (if any)
- price

Respond only with a JSON block. Do not include explanations or extra text.
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These snippets illustrate how the retrieved menu details are formatted and then embedded within the structured prompt sent to the LLM, providing essential context.&lt;/p&gt;
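&lt;p&gt;Here is a minimal, runnable sketch of this prompt-assembly step. The two-item menu, the prices, and the &lt;code&gt;top_n_indices&lt;/code&gt; value are invented for illustration; only the formatting pattern follows the post:&lt;/p&gt;

```python
import pandas as pd

# Invented menu rows; column names assumed from the article
menu_df = pd.DataFrame({
    "item_name": ["Latte", "Blueberry Muffin"],
    "description": ["Espresso with steamed milk", "Fresh-baked muffin"],
    "price": [4.50, 3.25],
})
top_n_indices = [1, 0]  # stand-in for the vector-search result
user_query = "one blueberry muffin please"

# Turn the top-N rows into plain dicts for prompt formatting
retrieved_items = menu_df.iloc[top_n_indices].to_dict("records")
retrieved_text = "\n".join(
    f"- {item['item_name']}: {item['description']} (Price: ${item['price']})"
    for item in retrieved_items
)
rag_prompt = (
    f'A customer asked: "{user_query}"\n\n'
    f"Relevant menu items:\n{retrieved_text}"
)
print(rag_prompt)
```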

&lt;h3&gt;
  
  
  5. Generating Structured JSON Output 🧾
&lt;/h3&gt;

&lt;p&gt;The agent sends the RAG-augmented prompt to a powerful Large Language Model, &lt;strong&gt;gemini-2.0-flash&lt;/strong&gt;. A key requirement is getting a machine-readable order, so we configure the LLM for Structured Output (JSON mode). We explicitly request JSON using &lt;code&gt;response_mime_type="application/json"&lt;/code&gt; in the API call configuration and reinforce this in the prompt instructions. The LLM then processes the query and context to generate the order details in the specified JSON format.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Capability:&lt;/strong&gt; Structured Output / Controlled Generation, LLM Interaction&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Library:&lt;/strong&gt; google-genai&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Model:&lt;/strong&gt; gemini-2.0-flash&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Key Operation:&lt;/strong&gt; API Call (&lt;code&gt;client.models.generate_content&lt;/code&gt;), Configuration (&lt;code&gt;response_mime_type&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Technical Consideration:&lt;/strong&gt; While JSON mode is powerful, robust parsing and validation post-generation are still necessary due to the probabilistic nature of LLM outputs.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;genai&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.genai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt;

&lt;span class="c1"&gt;# Assuming client is initialized and rag_prompt is constructed
&lt;/span&gt;
&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate_content&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-2.0-flash&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;GenerateContentConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
&lt;span class="n"&gt;response_mime_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="n"&gt;contents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;rag_prompt&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# The model's JSON response is in response.text
&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This code shows the API call to the Gemini model, specifying the model, the crucial JSON output configuration, and the RAG prompt as input.&lt;/p&gt;

&lt;h3&gt;
  
  
  6. Post-processing and Validation ✅
&lt;/h3&gt;

&lt;p&gt;The final steps involve the agent parsing the LLM's JSON response using &lt;code&gt;json.loads&lt;/code&gt;. A custom &lt;code&gt;validate_order&lt;/code&gt; function then checks if the items suggested by the LLM actually exist on the menu (using a &lt;code&gt;menu_lookup&lt;/code&gt; dictionary created from the menu data) and verifies their availability. This ensures the suggested order is valid based on the current menu state. This entire sequence, from understanding the query to validating the output, constitutes the basic &lt;strong&gt;Agentic workflow&lt;/strong&gt; of our system.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Capability:&lt;/strong&gt; Basic Agentic Behavior&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Key Operation:&lt;/strong&gt; JSON Parsing (&lt;code&gt;json.loads&lt;/code&gt;), Data Validation (custom function, dictionary lookup).
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="c1"&gt;# Assuming response is received from the LLM
&lt;/span&gt;
&lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="n"&gt;raw_output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
\&lt;span class="c1"&gt;# ... parsing and normalization logic to get order_json ...
&lt;/span&gt;&lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;JSONDecodeError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error decoding JSON. Raw response:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;order_json&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;order&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[]}&lt;/span&gt; \&lt;span class="c1"&gt;# Handle parsing errors gracefully
&lt;/span&gt;
&lt;span class="c1"&gt;# Assuming menu_lookup is created from menu_df metadata
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;validate_order&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order_json&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;menu_lookup&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
\&lt;span class="c1"&gt;# ... logic to check each item against menu_lookup ...
&lt;/span&gt;&lt;span class="k"&gt;pass&lt;/span&gt; \&lt;span class="c1"&gt;# Function definition snippet
&lt;/span&gt;
&lt;span class="c1"&gt;# validated_order = validate_order(order_json, menu_lookup)
&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This snippet highlights the JSON parsing step with error handling and references the validation function, crucial for ensuring the LLM's output is usable and correct.&lt;/p&gt;
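&lt;p&gt;To make the validation step concrete, here is a hypothetical sketch of what &lt;code&gt;validate_order&lt;/code&gt; might look like. The field names (&lt;code&gt;available&lt;/code&gt;, the &lt;code&gt;rejected&lt;/code&gt; bucket) and the sample data are assumptions for illustration, not the notebook's exact implementation:&lt;/p&gt;

```python
import json

# Invented lookup; in the project this is built from the menu DataFrame
menu_lookup = {
    "latte": {"price": 4.50, "available": True},
    "blueberry muffin": {"price": 3.25, "available": False},
}

def validate_order(order_json, menu_lookup):
    # Keep items that exist on the menu and are currently available;
    # collect everything else for graceful rejection
    valid, rejected = [], []
    for entry in order_json.get("order", []):
        info = menu_lookup.get(entry.get("item", "").lower())
        if info and info["available"]:
            valid.append(entry)
        else:
            rejected.append(entry)
    return {"order": valid, "rejected": rejected}

# Simulated LLM response (the real one arrives via response.text)
raw = '{"order": [{"item": "Latte", "quantity": 1}, {"item": "Blueberry Muffin", "quantity": 2}]}'
order_json = json.loads(raw)
print(validate_order(order_json, menu_lookup))
```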

&lt;h2&gt;
  
  
  Conclusion: Building Smarter Interfaces 🚀
&lt;/h2&gt;

&lt;p&gt;This project demonstrates how combining embeddings for semantic search, RAG for contextual grounding, and LLMs with structured output capabilities allows us to build intelligent agents capable of understanding natural language and generating precise, machine-readable responses. The Smart Café Menu Ordering Agent is a practical example of this pattern, directly converting a customer's free-text request into a structured JSON order.&lt;/p&gt;

&lt;p&gt;This RAG pattern is a foundational technique in AI Engineering for creating robust natural language interfaces across diverse domains. Future enhancements could involve adding conversation memory for multi-turn orders, integrating with a production-grade vector database for larger menus, implementing more sophisticated validation rules, or building out the agent workflow using frameworks like LangChain or LangGraph for increased complexity and tool use.&lt;/p&gt;

&lt;p&gt;By mastering these components, you can unlock the potential to build smarter, more intuitive systems that bridge the gap between human language and automated processes.&lt;/p&gt;

&lt;p&gt;If you found this post and the notebook helpful, please consider giving the notebook an upvote on Kaggle!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note:💡&lt;/strong&gt; You can find the full code for this project in my &lt;a href="https://www.kaggle.com/code/joshimihir/smart-caf-menu-ordering-agent-capstone-2025q1" rel="noopener noreferrer"&gt;Capstone Project on Kaggle&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>kaggle</category>
      <category>ai</category>
      <category>rag</category>
      <category>python</category>
    </item>
  </channel>
</rss>
