<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Noel Alex</title>
    <description>The latest articles on Forem by Noel Alex (@noel_alex_b235542c0cfc62a).</description>
    <link>https://forem.com/noel_alex_b235542c0cfc62a</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3544007%2F4a072db3-ff91-44e7-a070-bf683804a2e2.png</url>
      <title>Forem: Noel Alex</title>
      <link>https://forem.com/noel_alex_b235542c0cfc62a</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/noel_alex_b235542c0cfc62a"/>
    <language>en</language>
    <item>
      <title>Tired of AI Hallucinations? I Built a RAG App to Keep My Research Grounded.</title>
      <dc:creator>Noel Alex</dc:creator>
      <pubDate>Fri, 03 Oct 2025 16:49:28 +0000</pubDate>
      <link>https://forem.com/noel_alex_b235542c0cfc62a/tired-of-ai-hallucinations-i-built-a-rag-app-to-keep-my-research-grounded-40k3</link>
      <guid>https://forem.com/noel_alex_b235542c0cfc62a/tired-of-ai-hallucinations-i-built-a-rag-app-to-keep-my-research-grounded-40k3</guid>
      <description>&lt;p&gt;Hey everyone, I'm &lt;strong&gt;Noel Alex&lt;/strong&gt; from VIT Vellore! 👋&lt;/p&gt;

&lt;p&gt;Let's be real: we're all leaning on AI pretty heavily these days. Whether it's for debugging a stubborn piece of code or just exploring a new topic, LLMs have become our go-to. But there's a huge problem, especially when you're doing serious research.&lt;/p&gt;

&lt;p&gt;You ask a detailed question, and the AI gives you a beautifully written, confident-sounding answer that is... completely made up. It hallucinates. It invents facts, cites non-existent papers, and can send you down a rabbit hole of misinformation. For a developer or a student doing research, that's a nightmare.&lt;/p&gt;

&lt;p&gt;I ran into this exact wall while working on a research project. I needed answers I could trust, backed by actual, verifiable sources. I didn't want to "blindly trust AI"; I wanted to &lt;em&gt;use&lt;/em&gt; AI to augment my own intelligence, not replace my judgment.&lt;/p&gt;

&lt;p&gt;That’s when I decided to build my own solution: a &lt;strong&gt;Scientific Research Agent&lt;/strong&gt; that uses Retrieval-Augmented Generation (RAG) to give me answers grounded in reality.&lt;/p&gt;

&lt;h3&gt;The Mission: AI Answers You Can Actually Trust&lt;/h3&gt;

&lt;p&gt;The core idea behind RAG is simple but powerful: instead of letting an LLM pull answers from its vast, opaque training data, you give it a specific set of documents to use as its &lt;em&gt;only&lt;/em&gt; source of truth.&lt;/p&gt;

&lt;p&gt;The workflow looks like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;You provide the knowledge:&lt;/strong&gt; Upload a bunch of trusted research papers.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;You ask a question:&lt;/strong&gt; "What are the latest findings on quantum entanglement?"&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;The system retrieves:&lt;/strong&gt; It intelligently searches &lt;em&gt;only&lt;/em&gt; through your documents to find the most relevant paragraphs.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;The AI synthesizes:&lt;/strong&gt; It takes those relevant snippets and your question, and crafts an answer based &lt;em&gt;exclusively&lt;/em&gt; on that context.&lt;/li&gt;
&lt;/ol&gt;
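
&lt;p&gt;The four steps above can be sketched end to end in a few lines. This is a toy illustration only: real retrieval uses vector embeddings, not the keyword overlap used here, and helper names like &lt;code&gt;build_grounded_prompt&lt;/code&gt; are made up for the sketch.&lt;/p&gt;

```python
import re

# Toy RAG loop: keyword-overlap retrieval plus a grounded prompt.
# Illustration only -- the real app uses vector embeddings, not word overlap.

def tokens(text):
    """Lower-case word set, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(docs, query, k=2):
    """Step 3: rank documents by words shared with the query, keep the top k."""
    q = tokens(query)
    return sorted(docs, key=lambda d: len(q & tokens(d)), reverse=True)[:k]

def build_grounded_prompt(context_chunks, query):
    """Step 4: force the LLM to answer from the retrieved chunks only."""
    context = "\n---\n".join(context_chunks)
    return (
        "Based *only* on the provided context, answer the query.\n\n"
        f"Context:\n{context}\n\nQuery:\n{query}"
    )

docs = [
    "Quantum entanglement links the states of two particles.",
    "Streamlit builds web apps in pure Python.",
    "Entanglement experiments test Bell inequalities.",
]
query = "What are the latest findings on quantum entanglement?"
top_chunks = retrieve(docs, query)
prompt = build_grounded_prompt(top_chunks, query)
```

&lt;p&gt;Only the entanglement documents make it into the prompt; the unrelated Streamlit line is filtered out before the LLM ever sees the question.&lt;/p&gt;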

&lt;p&gt;No more hallucinations. No more made-up facts. Just pure, verifiable information synthesized into a coherent answer.&lt;/p&gt;

&lt;h3&gt;The Tech Stack: Building the "Grounding Engine"&lt;/h3&gt;

&lt;p&gt;I wanted this tool to be fast, efficient, and easy to use. Here’s the stack I chose to bring it to life:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;code&gt;Streamlit&lt;/code&gt; for the UI:&lt;/strong&gt; I love Streamlit. It lets you build interactive web apps with just Python. No messy HTML or JavaScript needed. It was perfect for creating a simple interface for uploading files and asking questions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;code&gt;llmware&lt;/code&gt; for the RAG Pipeline:&lt;/strong&gt; This library is a beast. It handled the entire backend RAG workflow seamlessly. It takes the uploaded PDFs, parses them, breaks them into smart chunks (way better than just splitting by a fixed number of characters), and then creates vector embeddings using a top-tier model like &lt;code&gt;jina-embeddings-v2&lt;/code&gt;. It basically builds the brain of my operation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;code&gt;Groq&lt;/code&gt; for Blazing-Fast Inference:&lt;/strong&gt; This was the game-changer. RAG involves sending a lot of context to the LLM, which can be slow and expensive. Groq’s LPU™ Inference Engine is absurdly fast. I used the powerful &lt;code&gt;Llama-3.3-70B&lt;/code&gt; model, and it generates answers almost instantly. That speed makes the app feel responsive and genuinely useful rather than like a slow, clunky research tool.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
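
&lt;p&gt;To see why "smart chunks" beat splitting at a fixed character count, here's a minimal sentence-aware chunker. It's a stand-alone sketch, &lt;em&gt;not&lt;/em&gt; &lt;code&gt;llmware&lt;/code&gt;'s actual algorithm: it just packs whole sentences greedily, so no chunk ever cuts a sentence in half.&lt;/p&gt;

```python
import re

def sentence_chunks(text, max_chars=200):
    """Greedily pack whole sentences into chunks of at most max_chars.

    Unlike a fixed-width split, this never severs a sentence mid-way,
    so every chunk stays a coherent unit for embedding and retrieval.
    (Illustration only -- llmware's internal chunker is more sophisticated.)
    """
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for s in sentences:
        if current and len(current) + 1 + len(s) > max_chars:
            chunks.append(current)   # current chunk is full; start a new one
            current = s
        else:
            current = f"{current} {s}".strip()
    if current:
        chunks.append(current)
    return chunks
```

&lt;p&gt;A fixed 200-character split would happily cut "quantum entangle|ment" in half; this keeps each embedded chunk readable, which is exactly what makes the retrieved context useful later.&lt;/p&gt;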

&lt;h3&gt;Let's See the Code in Action&lt;/h3&gt;

&lt;p&gt;The logic is surprisingly straightforward. Here's a high-level look at the Python script (&lt;code&gt;main.py&lt;/code&gt;):&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;File Upload &amp;amp; Processing (Sidebar):&lt;/strong&gt;&lt;br&gt;
The Streamlit sidebar has a file uploader. When I hit "Process &amp;amp; Embed Documents," this function kicks in:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Simplified from the app
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_and_embed_files&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;library_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;folder_path&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;library&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Library&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;create_new_library&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;library_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;library&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_files&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_folder_path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;folder_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;library&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;install_new_embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;embedding_model_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;EMBEDDING_MODEL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;vector_db&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chromadb&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;&lt;code&gt;llmware&lt;/code&gt; takes care of creating a library, parsing the docs, and embedding them into a local &lt;code&gt;ChromaDB&lt;/code&gt; vector store. Easy peasy.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Asking a Question:&lt;/strong&gt;&lt;br&gt;
When a user types a query and hits "Get Answer," two things happen.&lt;/p&gt;

&lt;p&gt;First, we perform a semantic search to find relevant context:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Find the most relevant text chunks from the library
&lt;/span&gt;&lt;span class="n"&gt;query_results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;library&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;semantic_query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;result_count&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;Second, we assemble a prompt with that context and send it to Groq:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Build the prompt with clear instructions
&lt;/span&gt;&lt;span class="n"&gt;prompt_template&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Based *only* on the provided context, answer the query.
If the context does not contain the answer, say so.

Context:
{context}

Query:
{query}
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;---&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;query_results&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;final_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prompt_template&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;user_query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Get the lightning-fast answer from Groq
&lt;/span&gt;&lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;ask_groq&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;final_prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;LLM_MODEL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;markdown&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;The key here is the prompt: &lt;strong&gt;"Based &lt;em&gt;only&lt;/em&gt; on the provided context..."&lt;/strong&gt;. This is the instruction that constrains the LLM and prevents it from hallucinating.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
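
&lt;p&gt;The &lt;code&gt;ask_groq&lt;/code&gt; helper is referenced above but not shown. A minimal version using the official &lt;code&gt;groq&lt;/code&gt; Python client could look like this; the system prompt, temperature, and exact model name here are my assumptions, not necessarily what the app uses:&lt;/p&gt;

```python
def build_messages(prompt):
    """Wrap the assembled RAG prompt in the chat format Groq expects."""
    return [
        {"role": "system", "content": "You are a careful research assistant. "
                                      "Answer only from the provided context."},
        {"role": "user", "content": prompt},
    ]

def ask_groq(prompt, model="llama-3.3-70b-versatile"):
    """Send the grounded prompt to Groq and return the answer text.

    Sketch only: the model id and settings are assumptions based on
    Groq's public model list, not the app's actual configuration.
    """
    from groq import Groq  # imported lazily so build_messages stays importable
    client = Groq()  # reads GROQ_API_KEY from the environment
    response = client.chat.completions.create(
        model=model,
        messages=build_messages(prompt),
        temperature=0.2,  # low temperature keeps answers close to the context
    )
    return response.choices[0].message.content
```

&lt;p&gt;The client picks up &lt;code&gt;GROQ_API_KEY&lt;/code&gt; from the environment, so the key never lands in the code or the repo.&lt;/p&gt;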

&lt;h3&gt;The Final Result: An AI I Can Finally Trust for Research&lt;/h3&gt;

&lt;p&gt;What I ended up with is a personal research assistant that I can fully trust. I feed it the papers, and it gives me back synthesized knowledge from &lt;em&gt;those papers alone&lt;/em&gt;. I can see the exact context it used, so I can always verify the source.&lt;/p&gt;

&lt;p&gt;This project was a fantastic learning experience. It showed me that the real power of AI isn't just in its raw creative ability, but in our ability as developers to channel that power in a controlled, reliable, and useful way.&lt;/p&gt;

&lt;p&gt;So next time you're frustrated with a chatbot giving you nonsense, remember: you have the power to ground it in reality. Give RAG a try!&lt;/p&gt;

&lt;p&gt;You can check out the full code on my &lt;a href="https://github.com/Noel-Alex/Scientific-RAG" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;. Let me know what you think!&lt;/p&gt;

</description>
      <category>showdev</category>
      <category>rag</category>
      <category>ai</category>
      <category>llm</category>
    </item>
  </channel>
</rss>
