<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: LLMWare</title>
    <description>The latest articles on Forem by LLMWare (@llmware).</description>
    <link>https://forem.com/llmware</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F8208%2F4bf5768d-460d-460b-9ccc-a80499ca040e.png</url>
      <title>Forem: LLMWare</title>
      <link>https://forem.com/llmware</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/llmware"/>
    <language>en</language>
    <item>
      <title>How to Create a Local Chatbot Without Coding in Less Than 10 Minutes on AI PCs</title>
      <dc:creator>Rohan Sharma</dc:creator>
      <pubDate>Wed, 02 Jul 2025 04:10:07 +0000</pubDate>
      <link>https://forem.com/llmware/how-to-create-a-local-chatbot-without-coding-in-less-than-10-minutes-on-ai-pcs-2ajl</link>
      <guid>https://forem.com/llmware/how-to-create-a-local-chatbot-without-coding-in-less-than-10-minutes-on-ai-pcs-2ajl</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;🔖 &lt;em&gt;No cloud. No internet. No coding.&lt;/em&gt; &lt;br&gt;
🔖 &lt;em&gt;Just you, your laptop, and 100+ powerful AI models running locally.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Imagine building your own chatbot that can answer your questions, summarize documents, analyze images, and even understand tables, all without needing an internet connection.&lt;/p&gt;

&lt;p&gt;Sounds futuristic?&lt;/p&gt;

&lt;p&gt;Thanks to &lt;strong&gt;Model HQ&lt;/strong&gt;, this is now a reality.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Model HQ&lt;/strong&gt;, developed by &lt;a href="http://llmware.ai" rel="noopener noreferrer"&gt;LLMWare&lt;/a&gt;, is an innovative application that allows you to create and run a chatbot locally on your PC or laptop &lt;strong&gt;without an internet connection&lt;/strong&gt;. Best of all, this can be done with &lt;strong&gt;NO CODE&lt;/strong&gt; in &lt;strong&gt;less than 10 minutes&lt;/strong&gt;, even on laptops up to 5 years old, provided they have 16GB or more of RAM.&lt;/p&gt;

&lt;p&gt;In this guide, we’ll walk you through how to create your own local chatbot using &lt;strong&gt;Model HQ&lt;/strong&gt;, a revolutionary AI desktop app by &lt;a href="https://llmware.ai" rel="noopener noreferrer"&gt;LLMWare.ai&lt;/a&gt;. Whether you’re a student, a developer, or a professional looking for a private and offline AI assistant, this tool puts the power of cutting-edge AI models &lt;strong&gt;directly on your laptop&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Let’s break it down.&lt;/p&gt;

&lt;p&gt;If you want to know about &lt;strong&gt;Model HQ in detail&lt;/strong&gt;, then read the blog below:&lt;br&gt;&lt;br&gt;
&lt;/p&gt;
&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/llmware/how-to-run-ai-models-privately-on-your-ai-pc-with-model-hq-no-cloud-no-code-3o9k" class="crayons-story__hidden-navigation-link"&gt;How to Run AI Models Privately on Your AI PC with Model HQ; No Cloud, No Code&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;
          &lt;a class="crayons-logo crayons-logo--l" href="/llmware"&gt;
            &lt;img alt="LLMWare logo" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F8208%2F4bf5768d-460d-460b-9ccc-a80499ca040e.png" class="crayons-logo__image"&gt;
          &lt;/a&gt;

          &lt;a href="/rohan_sharma" class="crayons-avatar  crayons-avatar--s absolute -right-2 -bottom-2 border-solid border-2 border-base-inverted  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1936949%2Fa1fd5434-8c99-4531-9491-2d117d2e6996.jpg" alt="rohan_sharma profile" class="crayons-avatar__image"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/rohan_sharma" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Rohan Sharma
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Rohan Sharma
                &lt;a href="/++"&gt;&lt;img alt="Subscriber" class="subscription-icon" src="https://assets.dev.to/assets/subscription-icon-805dfa7ac7dd660f07ed8d654877270825b07a92a03841aa99a1093bd00431b2.png"&gt;&lt;/a&gt;
              
              &lt;div id="story-author-preview-content-2629400" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/rohan_sharma" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1936949%2Fa1fd5434-8c99-4531-9491-2d117d2e6996.jpg" class="crayons-avatar__image" alt=""&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Rohan Sharma&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

            &lt;span&gt;
              &lt;span class="crayons-story__tertiary fw-normal"&gt; for &lt;/span&gt;&lt;a href="/llmware" class="crayons-story__secondary fw-medium"&gt;LLMWare&lt;/a&gt;
            &lt;/span&gt;
          &lt;/div&gt;
          &lt;a href="https://dev.to/llmware/how-to-run-ai-models-privately-on-your-ai-pc-with-model-hq-no-cloud-no-code-3o9k" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;Jun 27 '25&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/llmware/how-to-run-ai-models-privately-on-your-ai-pc-with-model-hq-no-cloud-no-code-3o9k" id="article-link-2629400"&gt;
          How to Run AI Models Privately on Your AI PC with Model HQ; No Cloud, No Code
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag crayons-tag--filled  " href="/t/showdev"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;showdev&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/ai"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;ai&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/security"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;security&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/nocode"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;nocode&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/llmware/how-to-run-ai-models-privately-on-your-ai-pc-with-model-hq-no-cloud-no-code-3o9k" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/exploding-head-daceb38d627e6ae9b730f36a1e390fca556a4289d5a41abb2c35068ad3e2c4b5.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/multi-unicorn-b44d6f8c23cdd00964192bedc38af3e82463978aa611b4365bd33a0f1f4f3e97.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;80&lt;span class="hidden s:inline"&gt; reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/llmware/how-to-run-ai-models-privately-on-your-ai-pc-with-model-hq-no-cloud-no-code-3o9k#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              Comments


              17&lt;span class="hidden s:inline"&gt; comments&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            5 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;


&lt;p&gt; &lt;/p&gt;

&lt;h2&gt;Step 1: Download Model HQ&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Model HQ&lt;/strong&gt; is an AI desktop application that allows you to interact with &lt;strong&gt;100+ top-performing AI models&lt;/strong&gt;, including large ones with up to &lt;strong&gt;32 billion parameters&lt;/strong&gt; — all running &lt;strong&gt;locally on your PC&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Unlike cloud-based tools, there’s &lt;strong&gt;no internet required&lt;/strong&gt;, and your data never leaves your machine. That means &lt;strong&gt;more privacy, better speed&lt;/strong&gt;, and zero cost for each query you run.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;In this blog, we will be looking into the &lt;strong&gt;CHAT&lt;/strong&gt; feature of Model HQ that helps us to create a chatbot running locally on our machine.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;First, get the app.&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://llmware-modelhq.checkoutpage.com/modelhq-client-app-for-windows" rel="noopener noreferrer"&gt;&lt;strong&gt;&lt;em&gt;Download or Buy Model HQ for Windows&lt;/em&gt;&lt;/strong&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Not ready to buy? No problem.&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://llmware.ai/enterprise#developers-waitlist" rel="noopener noreferrer"&gt;&lt;strong&gt;&lt;em&gt;Join the 90-Day Free Developer Trial&lt;/em&gt;&lt;/strong&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once installed, you’ll have access to an interface that feels like your own AI control panel.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h2&gt;Step 2: Choosing the Right AI Model&lt;/h2&gt;

&lt;p&gt;Once installation is done, open the Model HQ application; you will then be prompted to choose a setup method. The setup guide is provided after buying the application.&lt;/p&gt;

&lt;p&gt;After this, you will land in the main menu. Now, click on the Chat button.&lt;/p&gt;

&lt;p&gt;You’ll be prompted to select an AI model. If you’re unsure which model to choose, you can click on “choose for me,” and the application will select a suitable model based on your needs. Model HQ ships with 100+ models.&lt;/p&gt;

&lt;h3&gt;Available Model Options:&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Small Model&lt;/strong&gt;:&lt;br&gt;
~1–3 billion parameters — Fastest response time, suitable for basic chat.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Medium Model&lt;/strong&gt;:&lt;br&gt;
~7–8 billion parameters — Balanced performance, ideal for chat, data analysis, and standard RAG tasks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Large Model&lt;/strong&gt;:&lt;br&gt;
~9–32 billion parameters — Most powerful chat and RAG; best for advanced and complex analytical workloads.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By the way, Model HQ will pick a smart default based on your system and use case.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The size of the model you choose can significantly impact both speed and output quality. &lt;strong&gt;Smaller models are faster but may provide less detailed responses&lt;/strong&gt;. Follow this simple rule:&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvs5p0403z5nb9malw1g1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvs5p0403z5nb9malw1g1.png" alt="table"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h2&gt;Step 3: Downloading Models&lt;/h2&gt;

&lt;p&gt;For demonstration purposes, we are selecting the &lt;strong&gt;Small Model&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
If no models have been downloaded previously (e.g., in the &lt;strong&gt;No Setup&lt;/strong&gt;, &lt;strong&gt;Fast Setup&lt;/strong&gt;, or &lt;strong&gt;Full Setup&lt;/strong&gt; paths), the selected model will begin downloading automatically.&lt;br&gt;&lt;br&gt;
This process typically takes &lt;strong&gt;2–7 minutes&lt;/strong&gt;, depending on the model you selected and your internet speed. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;This is only a &lt;strong&gt;one-time internet requirement&lt;/strong&gt;; once the models are downloaded, you don’t need internet anymore.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt; &lt;/p&gt;
&lt;h2&gt;Step 4: Start Chatting&lt;/h2&gt;

&lt;p&gt;Once you’ve selected a model, you can start a chat by typing in your questions. For example, you might ask a simple question like, “What are the top sites to see in Paris?” The model will generate a response based on its training data.&lt;/p&gt;
&lt;h3&gt;Customizing Your Chat Experience&lt;/h3&gt;

&lt;p&gt;Model HQ allows you to customize your chat experience further. You can adjust settings such as the maximum output length and the randomness of the responses (known as temperature). By default, the app is set to generate up to 1,000 tokens, which is usually sufficient for smaller models. Even if you’re using larger models, be cautious about increasing this limit, as it can consume more memory and take longer to generate responses. In short, you can adjust the following &lt;strong&gt;generation settings&lt;/strong&gt; (see the sketch after this list):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Max Tokens&lt;/strong&gt;: How long should the response be?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Temperature&lt;/strong&gt;: Should the answer be creative or precise?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Stop/Restart&lt;/strong&gt;: Hit ❌ to stop a long generation anytime.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
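
&lt;p&gt;Model HQ itself needs no code, but if you are curious, the same two knobs appear in LLMWare’s open-source Python library (used in LLMWare’s Fast Start tutorial series). A minimal sketch, assuming the &lt;code&gt;llmware&lt;/code&gt; package is installed; the model name is only an example, and the maximum output length is configured in the Model HQ UI, so it appears here only in a comment:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from llmware.prompts import Prompt

#   sketch only - Model HQ exposes these settings in its UI;
#   "bling-phi-3-gguf" is an example model name, not a Model HQ setting
prompter = Prompt().load_model("bling-phi-3-gguf")

#   lower temperature = more precise, higher = more creative
#   (max output length, ~1,000 tokens by default, is set in the app itself)
output = prompter.prompt_main("What are the top sites to see in Paris?",
                              temperature=0.30)

print(output["llm_response"])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;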

&lt;p&gt;     &lt;/p&gt;
&lt;h2&gt;Step 5: Integrating Sources for Enhanced Responses&lt;/h2&gt;

&lt;p&gt;One of the standout features of Model HQ is its ability to integrate sources, such as documents and images, into your chat. To do this, simply click on the “source” button and upload a file, such as a PDF or Word document.&lt;/p&gt;
&lt;h3&gt;Example: Using a Document as a Source&lt;/h3&gt;

&lt;p&gt;For instance, if you upload an executive employment agreement, you can ask specific questions about the clauses within the document. The model will reference the uploaded document to provide accurate answers. This feature is invaluable for fact-checking and ensuring that you have the right information at your fingertips.&lt;/p&gt;
&lt;h3&gt;Chatting with Images&lt;/h3&gt;

&lt;p&gt;Model HQ also allows you to chat with images. By uploading an image, the application can analyze the content and answer questions based on what it sees. This capability opens up a world of possibilities for multimedia processing, all done locally on your machine without any additional costs.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;
&lt;h2&gt;Step 6: Saving and Downloading Results&lt;/h2&gt;

&lt;p&gt;After you’ve finished your session, you can save the chat results for future reference. This is particularly useful if you need to compile information for reports or presentations. Simply download the results, and you’ll have everything you need at your fingertips.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;
&lt;h2&gt;Step 7: Exploring Advanced Features&lt;/h2&gt;

&lt;p&gt;As you become more comfortable with Model HQ, you can explore its advanced features. For example, you can experiment with different models to see how they perform with various types of queries. You can also adjust the generation settings to fine-tune the responses based on your specific needs.&lt;/p&gt;

&lt;p&gt;If you’re a visual learner, then watch this YouTube walkthrough:&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/6z3kyUpsGys"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h3&gt;Future Updates and Community Engagement&lt;/h3&gt;

&lt;p&gt;Stay engaged with the Model HQ community by following their updates and tutorials on platforms like YouTube. The &lt;a href="https://youtube.com/playlist?list=PL1-dn33KwsmBiKZDobr9QT-4xI8bNJvIU&amp;amp;si=dLdhu0kMQWwgBwTE" rel="noopener noreferrer"&gt;&lt;strong&gt;&lt;em&gt;Model HQ YouTube playlist&lt;/em&gt;&lt;/strong&gt;&lt;/a&gt; offers valuable insights and tips to help you maximize your experience with the application.&lt;/p&gt;

&lt;p&gt;Join &lt;a href="https://discord.gg/bphreFK4NJ" rel="noopener noreferrer"&gt;&lt;strong&gt;&lt;em&gt;LLMWare’s Official Discord Server&lt;/em&gt;&lt;/strong&gt;&lt;/a&gt; to interact with LLMWare’s community of users and to share any questions or feedback.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h3&gt;Why This Matters&lt;/h3&gt;

&lt;p&gt;Most AI apps require you to upload data to a cloud server. That’s slow, often expensive, and puts your privacy at risk.&lt;/p&gt;

&lt;p&gt;With &lt;strong&gt;Model HQ&lt;/strong&gt;, everything runs on your own machine with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;✅ No internet needed&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;✅ No Coding Required&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;✅ No API keys or credits&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;✅ No data leaves your PC&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;✅ Zero cost per query&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It’s &lt;strong&gt;your personal AI lab&lt;/strong&gt;, fully private and offline.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h2&gt;Conclusion: Get Started with Model HQ Today!&lt;/h2&gt;

&lt;p&gt;Creating a chatbot that runs locally, without coding or an internet connection, has never been easier. With Model HQ, you have access to a powerful AI tool that can enhance your productivity and streamline your workflow. &lt;/p&gt;

&lt;p&gt;Ready to experience the future of AI? Visit the &lt;a href="https://llmware.ai" rel="noopener noreferrer"&gt;&lt;strong&gt;&lt;em&gt;LLMWare website&lt;/em&gt;&lt;/strong&gt;&lt;/a&gt; to learn more about Model HQ and its features. Don’t forget to sign up for the &lt;a href="https://llmware.ai/enterprise#developers-waitlist" rel="noopener noreferrer"&gt;&lt;strong&gt;&lt;em&gt;90-day free trial for developers here&lt;/em&gt;&lt;/strong&gt;&lt;/a&gt; and explore the application firsthand. When you’re ready to make the leap, you can &lt;a href="https://llmware-modelhq.checkoutpage.com/modelhq-client-app-for-windows" rel="noopener noreferrer"&gt;&lt;strong&gt;&lt;em&gt;purchase Model HQ directly here&lt;/em&gt;&lt;/strong&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Unlock the full potential of AI on your PC or laptop with Model HQ today, and take the first step towards creating your very own local chatbot!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>nocode</category>
      <category>security</category>
      <category>showdev</category>
    </item>
    <item>
      <title>How to Run AI Models Privately on Your AI PC with Model HQ; No Cloud, No Code</title>
      <dc:creator>Rohan Sharma</dc:creator>
      <pubDate>Fri, 27 Jun 2025 04:20:58 +0000</pubDate>
      <link>https://forem.com/llmware/how-to-run-ai-models-privately-on-your-ai-pc-with-model-hq-no-cloud-no-code-3o9k</link>
      <guid>https://forem.com/llmware/how-to-run-ai-models-privately-on-your-ai-pc-with-model-hq-no-cloud-no-code-3o9k</guid>
      <description>&lt;p&gt;In an era where efficiency and data privacy are paramount, &lt;strong&gt;Model HQ by&lt;/strong&gt; &lt;a href="https://llmware.ai" rel="noopener noreferrer"&gt;&lt;strong&gt;LLMWare&lt;/strong&gt;&lt;/a&gt; emerges as a game-changer for professionals and enthusiasts alike. Built by LLMWare, Model HQ is a groundbreaking desktop application that transforms your own PC or laptop into a fully private, high-performance AI workstation.&lt;/p&gt;

&lt;p&gt;Most AI tools rely on the cloud. &lt;strong&gt;Model HQ&lt;/strong&gt; doesn’t.&lt;/p&gt;

&lt;p&gt;No more cloud latency. No more vendor lock-in. Just &lt;strong&gt;100+ cutting-edge AI models&lt;/strong&gt;, blazing-fast document search, and natural language tools, all running &lt;strong&gt;locally&lt;/strong&gt; on your machine.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h2&gt;What is Model HQ?&lt;/h2&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/Dbxb5qfsMaM"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Model HQ&lt;/strong&gt; is a powerful, no-code desktop application that enables users to run enterprise-grade AI workflows &lt;strong&gt;locally&lt;/strong&gt;, &lt;strong&gt;securely&lt;/strong&gt;, and &lt;strong&gt;at scale,&lt;/strong&gt; right from their own PC or laptop. Designed for simplicity and performance, it provides point-and-click access to &lt;strong&gt;100+ state-of-the-art AI models&lt;/strong&gt;, ranging from &lt;strong&gt;1B to 32B parameters&lt;/strong&gt;, with built-in optimization for AI PCs and Intel hardware. Whether you’re building AI applications, analyzing documents, or querying data, Model HQ automatically adapts to your device’s specs to ensure &lt;strong&gt;fast, efficient inferencing,&lt;/strong&gt; even for large models that traditionally struggle on standard formats.&lt;/p&gt;

&lt;p&gt;What truly sets Model HQ apart is its &lt;strong&gt;privacy-first, offline capability&lt;/strong&gt;. Once models are downloaded, they can be used without Wi-Fi, keeping &lt;strong&gt;your data and sensitive information 100% on-device&lt;/strong&gt;. This makes it the fastest and most secure way to explore and deploy powerful AI tools without depending on the cloud or external APIs. From developers and researchers to enterprise teams, Model HQ delivers a &lt;strong&gt;seamless, cost-effective, and private AI experience&lt;/strong&gt;, all in one sleek, local platform.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h2&gt;What Can Model HQ Do?&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Febpj6vk0qze2o2myp8qu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Febpj6vk0qze2o2myp8qu.png" alt="model hq"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Chat:&lt;/strong&gt;&lt;br&gt;
The Chat feature gives users a fast way to start experimenting with chat models of various sizes, from Small (1–3 billion parameters) and Medium (7–8 billion parameters) to Large (9–32 billion parameters).&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Small Model&lt;/strong&gt;:&lt;br&gt;&lt;br&gt;
~1–3 billion parameters — Fastest response time, suitable for basic chat.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Medium Model&lt;/strong&gt;:&lt;br&gt;&lt;br&gt;
~7–8 billion parameters — Balanced performance, ideal for chat, data analysis and standard RAG tasks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Large Model&lt;/strong&gt;:&lt;br&gt;&lt;br&gt;
~9–32 billion parameters — Most powerful chat and RAG; best for advanced and complex analytical workloads.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://youtu.be/6z3kyUpsGys?si=4kYvkPEBUJN81nT6" rel="noopener noreferrer"&gt;Watch Chat in Action&lt;/a&gt;&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;2. Agents&lt;/strong&gt;&lt;br&gt;
Agents in Model HQ are pre-configured or custom-built workflows that automate complex tasks using local AI models. They allow users to process files, extract insights, or perform multi-step operations, &lt;strong&gt;all with point-and-click simplicity&lt;/strong&gt; and no coding required.&lt;/p&gt;

&lt;p&gt;Users can &lt;strong&gt;build new agents from scratch&lt;/strong&gt;, &lt;strong&gt;load existing ones&lt;/strong&gt; (either from built-in templates or previously created workflows), and manage them through a simple dropdown interface. From editing or deleting agents to running &lt;strong&gt;batch operations&lt;/strong&gt; on multiple documents, the Agent system provides a flexible way to scale private, on-device AI workflows. Pre-created agents include powerful tools like &lt;strong&gt;Contract Analyzer&lt;/strong&gt;, &lt;strong&gt;Customer Support Bot&lt;/strong&gt;, &lt;strong&gt;Financial Data Extractor&lt;/strong&gt;, &lt;strong&gt;Image Tagger&lt;/strong&gt;, and more — each designed to handle specific tasks efficiently.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://youtu.be/UTNQxspDi3I?si=yOaPilNSEqY1xLFy" rel="noopener noreferrer"&gt;Watch Agents in Action&lt;/a&gt;&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;3. Bots&lt;/strong&gt;&lt;br&gt;
The Bots feature allows users to create their own custom Chat and RAG bots seamlessly for either the AI PC/edge device use case (Fast Start Chatbot and Model HQ Biz Bot) or via API deployment (Model HQ API Server Biz Bot).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://youtu.be/uy53WKrMOXc?si=TAaS_hYj0AddXu2R" rel="noopener noreferrer"&gt;Watch Bots in Action&lt;/a&gt;&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;4. RAG&lt;/strong&gt;&lt;br&gt;
RAG combines retrieval-based techniques with generative AI to allow models to answer questions more accurately by retrieving relevant information from external sources or documents. With RAG in Model HQ, you can create knowledge bases that you can query in the chat section or via a custom bot by uploading documents. The RAG section is used only to create the knowledge base.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://youtu.be/FSjpAgIZnPM?si=5kMR_sXH_pCyNLvg" rel="noopener noreferrer"&gt;Watch Rag in Action&lt;/a&gt;&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;5. Models&lt;/strong&gt;&lt;br&gt;
The Models section allows you to explore, manage, and test models within Model HQ. You can discover new models, manage downloaded models, review inference history, and run benchmark tests, all from a single interface.&lt;/p&gt;

&lt;p&gt;And all of this can be done while keeping your &lt;strong&gt;data private, your workflows offline, and your AI performance fully optimized for your device&lt;/strong&gt; — no internet, no cloud, and no compromise. With its powerful features and user-friendly interface, Model HQ empowers you to leverage AI technology without compromising on security. Experience the future of AI today and transform the way you work!&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;
&lt;h2&gt;System Requirements&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1nqbehtpmqis291asyjc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1nqbehtpmqis291asyjc.png" alt="sys_req"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;
&lt;h3&gt;Experience Model HQ Risk-Free&lt;/h3&gt;

&lt;p&gt;We understand that trying new software can be a leap of faith. That’s why we’re offering a &lt;a href="https://llmware.ai/enterprise#developers-waitlist" rel="noopener noreferrer"&gt;&lt;strong&gt;&lt;em&gt;90-day free trial for developers&lt;/em&gt;&lt;/strong&gt;&lt;/a&gt;. Experience the full capabilities of Model HQ without any commitment. Sign up for the trial here and discover how it can transform your workflow.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;
&lt;h3&gt;A Powerful Collaboration with Intel&lt;/h3&gt;

&lt;p&gt;LLMWare.ai has partnered with Intel to optimize Model HQ for peak performance on your devices. This collaboration ensures that you receive a reliable and efficient AI experience, making your tasks smoother and more productive. Learn more about this exciting partnership &lt;a href="https://llmware.ai/intel" rel="noopener noreferrer"&gt;&lt;strong&gt;&lt;em&gt;here&lt;/em&gt;&lt;/strong&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Read the Intel Solution Brief here:&lt;br&gt;
&lt;/p&gt;
&lt;div class="crayons-card c-embed text-styles text-styles--secondary"&gt;
    &lt;div class="c-embed__content"&gt;
        &lt;div class="c-embed__cover"&gt;
          &lt;a href="https://www.intel.com/content/www/us/en/content-details/854280/local-ai-no-code-more-secure-with-ai-pcs-and-the-private-cloud.html" class="c-link align-middle" rel="noopener noreferrer"&gt;
            &lt;img alt="" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.intel.com%2Fetc.clientlibs%2Fsettings%2Fwcm%2Fdesigns%2Fintel%2Fus%2Fen%2Fimages%2Fresources%2Fprintlogo.png" height="auto" class="m-0"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="c-embed__body"&gt;
        &lt;h2 class="fs-xl lh-tight"&gt;
          &lt;a href="https://www.intel.com/content/www/us/en/content-details/854280/local-ai-no-code-more-secure-with-ai-pcs-and-the-private-cloud.html" rel="noopener noreferrer" class="c-link"&gt;
            Local AI—No Code, More Secure with AI PCs and the Private Cloud
          &lt;/a&gt;
        &lt;/h2&gt;
          &lt;p class="truncate-at-3"&gt;
            Bring secure, no-code GenAI to your enterprise with Intel® AI PCs and LLMWare’s Model HQ—run agents and RAG queries locally without exposing data or incurring cloud costs.
In this brief, learn how to scale private AI simply and affordably.
          &lt;/p&gt;
        &lt;div class="color-secondary fs-s flex items-center"&gt;
            &lt;img alt="favicon" class="c-embed__favicon m-0 mr-2 radius-0" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.intel.com%2Fetc.clientlibs%2Fsettings%2Fwcm%2Fdesigns%2Fintel%2Fdefault%2Fresources%2Ffavicon-32x32.png"&gt;
          intel.com
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;


&lt;p&gt; &lt;/p&gt;

&lt;h3&gt;Take the Next Step Towards AI Empowerment&lt;/h3&gt;

&lt;p&gt;Don’t miss the chance to elevate your productivity with Model HQ. Whether you’re a business professional, a developer, or a student, this application is designed to meet your needs and exceed your expectations.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://llmware-modelhq.checkoutpage.com/modelhq-client-app-for-windows" rel="noopener noreferrer"&gt;&lt;strong&gt;Purchase Model HQ Today!&lt;/strong&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Ready to unlock the full potential of AI on your PC or laptop? Buy Model HQ now using the link above and take the first step towards a smarter, more efficient future.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h3&gt;Learn More About Model HQ&lt;/h3&gt;

&lt;p&gt;For additional information about Model HQ, including detailed features and user guides, &lt;a href="https://llmware.ai" rel="noopener noreferrer"&gt;&lt;em&gt;visit our website&lt;/em&gt;&lt;/a&gt;. Don’t forget to check out our introductory video and explore our &lt;a href="https://youtube.com/playlist?list=PL1-dn33KwsmBiKZDobr9QT-4xI8bNJvIU&amp;amp;si=dLdhu0kMQWwgBwTE" rel="noopener noreferrer"&gt;&lt;em&gt;YouTube playlist&lt;/em&gt;&lt;/a&gt; for tutorials and tips.&lt;/p&gt;

&lt;p&gt;Join &lt;a href="https://discord.gg/bphreFK4NJ" rel="noopener noreferrer"&gt;&lt;em&gt;LLMWare’s official Discord Server&lt;/em&gt;&lt;/a&gt; to interact with LLMWare's community of users and to share any questions or feedback.&lt;/p&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h2&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Model HQ&lt;/strong&gt; isn’t just another AI app; it’s a complete, offline-first platform built for &lt;strong&gt;speed, privacy, and control&lt;/strong&gt;. Whether you’re chatting with LLMs, building agents, analyzing documents, or deploying custom bots, everything runs &lt;strong&gt;securely on your own PC or laptop&lt;/strong&gt;. With support for models up to &lt;strong&gt;32B parameters&lt;/strong&gt;, RAG-enabled document search, natural language SQL, and no-code workflows, Model HQ brings enterprise-grade AI directly to your desktop, no cloud required.&lt;/p&gt;

&lt;p&gt;As the world moves toward AI-powered productivity, Model HQ ensures you’re ahead of the curve with a faster, safer, and smarter way to work.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>nocode</category>
      <category>showdev</category>
    </item>
    <item>
      <title>How I Learned Generative AI in Two Weeks (and You Can Too): Part 3 - Prompts &amp; Models</title>
      <dc:creator>Julia Zhou</dc:creator>
      <pubDate>Wed, 14 May 2025 12:05:49 +0000</pubDate>
      <link>https://forem.com/llmware/how-i-learned-generative-ai-in-two-weeks-and-you-can-too-part-3-prompts-models-dd7</link>
      <guid>https://forem.com/llmware/how-i-learned-generative-ai-in-two-weeks-and-you-can-too-part-3-prompts-models-dd7</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;It's been a few months since the last installment in this series, but a new year brings more LLMWare Fast Start to RAG examples! In the previous articles, we covered creating libraries and transforming that information into embeddings. Now that we have done the heavy lifting, so to speak, we are ready to begin writing prompts and getting responses. This will be the focus of today's article.&lt;/p&gt;

&lt;h2&gt;Extra resources&lt;/h2&gt;

&lt;p&gt;A few notes before we start! In case you missed them, I will link the previous articles in this series. This example will build upon &lt;a href="https://dev.to/llmware/how-i-learned-generative-ai-in-two-weeks-and-you-can-too-part-1-libraries-215h"&gt;example 1&lt;/a&gt; and &lt;a href="https://dev.to/llmware/how-i-learned-generative-ai-in-two-weeks-and-you-can-too-part-2-embeddings-2ppc"&gt;example 2&lt;/a&gt; and will assume prior understanding of these topics.  &lt;/p&gt;

&lt;p&gt;For visual learners, here is a video that works through example 3. Feel free to watch the video before following the steps in this article. Also, here is a Python Notebook that breaks down this example's code alongside the output.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/llmware-ai/llmware/blob/cbed79d0c185ab2626a2c53fe20c262734d4e7f5/examples/Notebooks/fast_start_examples/example_3_prompts_and_models_version_1.ipynb" class="crayons-btn crayons-btn--primary" rel="noopener noreferrer"&gt;Notebook for example 3: prompts and models&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/swiu4oBVfbA"&gt;
  &lt;/iframe&gt;
 &lt;/p&gt;
&lt;h2&gt;The code&lt;/h2&gt;

&lt;p&gt;Now, we are ready to take a look at the example's code! This LLMWare Fast Start example can be run in the same way as the previous ones, but instructions can be found in our &lt;a href="https://github.com/llmware-ai/llmware/blob/main/fast_start/README.md" rel="noopener noreferrer"&gt;README file&lt;/a&gt; if needed. Example 3 is directly copy-paste ready!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/llmware-ai/llmware/blob/main/fast_start/rag/example-3-prompts_and_models.py" class="crayons-btn crayons-btn--primary" rel="noopener noreferrer"&gt;Code for example 3: prompts and models&lt;/a&gt;
&lt;/p&gt;

&lt;h2&gt;Part 1 - What are prompts?&lt;/h2&gt;

&lt;p&gt;While working through this example, I read an MIT Sloan Teaching &amp;amp; Learning Technologies article titled "Effective Prompts for AI: The Essentials". The entire article is definitely worth a read, but I wanted to share a quote that summarizes what &lt;strong&gt;prompts&lt;/strong&gt; are in the AI world. To read the whole article, check out &lt;a href="https://mitsloanedtech.mit.edu/ai/basics/effective-prompts/" rel="noopener noreferrer"&gt;this link&lt;/a&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Prompts are your input into the AI system to obtain specific results. In other words, prompts are conversation starters: what and how you tell something to the AI for it to respond in a way that generates useful responses for you ... It’s like having a conversation with another person, only in this case the conversation is text-based, and your interlocutor is AI.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In other words, the prompt you provide determines how the AI responds. To create the most effective prompts, use specific wording and consider providing context, including in the form of additional text paragraphs. &lt;/p&gt;

&lt;h2&gt;Part 2 - Which model should I use?&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;model catalog&lt;/strong&gt; is a list of all the models LLMWare has registered. Like a dictionary, each model in the catalog is automatically linked with configuration data and implementation classes for easy use. The goal of the catalog is exactly that: ease of use. Given only a model's name, if the model is present in the catalog, it can be loaded and run without any other information. &lt;/p&gt;

&lt;p&gt;The following lines of code provide lists of models included in the catalog. More information about the capabilities and performance of these models is included as comments in the Python code file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#   all generative models
llm_models = ModelCatalog().list_generative_models()

#   if you only want to see the local models
llm_local_models = ModelCatalog().list_generative_local_models()

#   to see only the open source models
llm_open_source_models = ModelCatalog().list_open_source_models()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
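
&lt;p&gt;To see what these lists actually contain, you can print the model names. A quick sketch, under the assumption that each catalog entry is a dictionary with a &lt;code&gt;"model_name"&lt;/code&gt; key:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from llmware.models import ModelCatalog

llm_models = ModelCatalog().list_generative_models()

#   assumption: each catalog entry is a dict with a "model_name" key
for i, model_card in enumerate(llm_models):
    print(i, model_card["model_name"])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;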



&lt;p&gt;The following line of code selects a model by index. To choose a different model, simply replace the index value.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;model_name = gguf_generative_models[0]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Alternatively, we can choose a specific model by name. For those interested in exploring RAG through &lt;strong&gt;OpenAI&lt;/strong&gt;, all of the LLMWare examples are ready to use. In this particular example, uncomment the following lines and insert the necessary information.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#   model_name = "gpt-4"
#   os.environ["USER_MANAGED_OPENAI_API_KEY"] = "&amp;lt;insert-your-openai-key&amp;gt;"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;However, these examples also encourage the use of &lt;strong&gt;open-source&lt;/strong&gt; models: locally deployed models that produce top-notch quality right on your laptop. The progress in open-source models over the past few years cannot be overstated. The future of AI is here in these small, specialized models optimized for a specific purpose. &lt;/p&gt;

&lt;p&gt;For example, LLMWare's &lt;strong&gt;Bling 1B&lt;/strong&gt; is a small, fast model fine-tuned for RAG that runs on your local machine. &lt;/p&gt;

&lt;p&gt;To learn more about LLMWare's &lt;strong&gt;Bling&lt;/strong&gt; and &lt;strong&gt;Dragon&lt;/strong&gt; models, consider visiting their &lt;a href="https://huggingface.co/models?other=llmware-rag" rel="noopener noreferrer"&gt;Hugging Face&lt;/a&gt; page!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmedia1.giphy.com%2Fmedia%2Fv1.Y2lkPTc5MGI3NjExbjR3NmsxcnR0am1jN3BxNzdoM3hqMGF1ZnU1cTZmOXF5amIxOGU1cSZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw%2FHzPtbOKyBoBFsK4hyc%2Fgiphy.gif" class="article-body-image-wrapper"&gt;&lt;img width="50%" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmedia1.giphy.com%2Fmedia%2Fv1.Y2lkPTc5MGI3NjExbjR3NmsxcnR0am1jN3BxNzdoM3hqMGF1ZnU1cTZmOXF5amIxOGU1cSZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw%2FHzPtbOKyBoBFsK4hyc%2Fgiphy.gif"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Part 3 - Main example script&lt;/h2&gt;

&lt;p&gt;Now, we can head to the main example script, &lt;code&gt;fast_start_prompting&lt;/code&gt;. We will follow four general steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Pull sample questions&lt;/li&gt;
&lt;li&gt;Load the model&lt;/li&gt;
&lt;li&gt;Prompt the model&lt;/li&gt;
&lt;li&gt;Get results&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The sample questions (each with query, answer, and context) are found at the top of the Python file. They cover a variety of fields with a little extra emphasis on business, financial, and legal applications. However, it is always encouraged to change these questions or add to them to better suit your interests and needs! All of the questions will be pulled in through the &lt;code&gt;test_list&lt;/code&gt;. &lt;/p&gt;
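
&lt;p&gt;For reference, each entry in &lt;code&gt;test_list&lt;/code&gt; follows a simple dictionary shape. Here is a sketch that reuses the invoice question whose output appears in Part 4; the context string is only a placeholder for the real passage in the file:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#   sketch of the structure each sample question follows
test_list = [
    {"query": "What is the total amount of the invoice?",
     "answer": "$22,500.00",                       # the "gold answer"
     "context": "(a passage of invoice text for the model to read)"}
]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;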

&lt;p&gt;To use the model, we create a &lt;strong&gt;prompt object&lt;/strong&gt;. Prompts are how we talk to a model: we use them when we have a question and its context that we want to pass to the model to receive a response. This line of code loads the model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;prompter = Prompt().load_model(model_name)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first time we load the model, it needs to "move" from the LLMWare Hugging Face repository to your local system, which can take a few minutes. However, once that is complete, all the work the model does will happen locally on your computer!&lt;/p&gt;

&lt;p&gt;Now, we loop through our list of questions. The key method &lt;code&gt;.prompt_main&lt;/code&gt; in the prompt class runs inference on the model. The only mandatory parameter for this method is the query. Optionally, &lt;code&gt;context&lt;/code&gt;, &lt;code&gt;prompt_name&lt;/code&gt;, and &lt;code&gt;temperature&lt;/code&gt; can also be passed in.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;output = prompter.prompt_main(entries["query"],
                                      context=entries["context"],
                                      prompt_name="default_with_context",
                                      temperature=0.30)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;strong&gt;context&lt;/strong&gt; is a passage of information we want the model to read before answering the question. This allows us to explain what we want the model to consider in its answer, and it will answer based on the passage. &lt;/p&gt;

&lt;p&gt;The prompt catalog supports a range of &lt;strong&gt;prompt names&lt;/strong&gt;. The code uses &lt;code&gt;default_with_context&lt;/code&gt;, which tells the model to read the provided context and answer the question. &lt;/p&gt;

&lt;p&gt;Adjusting the temperature will change the results of the query. In general, a lower temperature yields more factual responses that relate directly to the context, while higher temperatures are more appropriate when we want a more creative response from the model. For RAG-based applications, we set the temperature comparatively low to yield the most consistency and quality. &lt;/p&gt;
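
&lt;p&gt;To make the trade-off concrete, here is a sketch that issues the same query at two temperatures; the 0.0 and 0.8 values are illustrative, not taken from the example file:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#   same question, two temperatures - a sketch of the trade-off
factual = prompter.prompt_main(entries["query"], context=entries["context"],
                               prompt_name="default_with_context",
                               temperature=0.0)   # stays close to the context

creative = prompter.prompt_main(entries["query"], context=entries["context"],
                                prompt_name="default_with_context",
                                temperature=0.8)  # allows more variation
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;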

&lt;p&gt;The &lt;code&gt;output&lt;/code&gt; is a dictionary with two keys: &lt;code&gt;llm_response&lt;/code&gt; and &lt;code&gt;usage&lt;/code&gt;. &lt;/p&gt;
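
&lt;p&gt;For example, you can unpack the result like this; the &lt;code&gt;usage&lt;/code&gt; keys below match the sample print-out shown in Part 4:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;response = output["llm_response"]
usage = output["usage"]

print("LLM Response:", response)
print("tokens - input: {} / output: {}".format(usage["input"], usage["output"]))
print("processing time:", usage["processing_time"])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;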

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmedia1.giphy.com%2Fmedia%2Fv1.Y2lkPTc5MGI3NjExZWZ3cjk3ejJjZnd4Zzc1OGpsYTR2em5za3dyamY3cjluc2gxOHN5YyZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw%2F8bE0EERrvXkq5S9BCa%2Fgiphy.gif" class="article-body-image-wrapper"&gt;&lt;img width="50%" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmedia1.giphy.com%2Fmedia%2Fv1.Y2lkPTc5MGI3NjExZWZ3cjk3ejJjZnd4Zzc1OGpsYTR2em5za3dyamY3cjluc2gxOHN5YyZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw%2F8bE0EERrvXkq5S9BCa%2Fgiphy.gif"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Part 4 - Running the model&lt;/h2&gt;

&lt;p&gt;Once you run the code, you will see the queries being iterated through and printed out. Each of these print-outs has an &lt;strong&gt;LLM Response&lt;/strong&gt; and a &lt;strong&gt;Gold Answer&lt;/strong&gt;. The &lt;strong&gt;LLM Response&lt;/strong&gt; is the model's response while the &lt;strong&gt;Gold Answer&lt;/strong&gt; is an "answer key" we created that the model does not see. This allows us to quickly compare the two answers and check for the model's accuracy. &lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;LLM Usage&lt;/strong&gt; line provides additional information about how the model formulated its response. In particular, you can see the "processing_time" for each query, which showcases the model's speed. Of course, the computer you run the models on will also affect speed: the amount of RAM available is especially impactful for efficiency.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. Query: What is the total amount of the invoice?
LLM Response: 22,500.00
Gold Answer: $22,500.00
LLM Usage: {'input': 209, 'output': 9, 'total': 218, 'metric': 'tokens', 'processing_time': 2.0669240951538086}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The above output is a sample response. The LLM correctly responded to the query since its response matches the gold answer. &lt;/p&gt;

&lt;p&gt;We have successfully received answers to our questions! Congrats on reaching the end of this example. Here is a link to the full working code!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/llmware-ai/llmware/blob/cbed79d0c185ab2626a2c53fe20c262734d4e7f5/fast_start/rag/example-3-prompts_and_models.py" class="crayons-btn crayons-btn--primary" rel="noopener noreferrer"&gt;FULL CODE&lt;/a&gt;
&lt;/p&gt;

&lt;h2&gt;Part 5 - Further exploration&lt;/h2&gt;

&lt;p&gt;To experiment more with this example, consider changing out the &lt;code&gt;model_name&lt;/code&gt; for other models! How does the LLMWare Bling model compare to the LLMWare Dragon model or OpenAI? Will these models generate the same response when provided the same queries and context? Once you try out these questions, let us know what you think!&lt;/p&gt;

&lt;p&gt;I hope you enjoyed this example about prompts and models! The next example will be about &lt;strong&gt;RAG text query&lt;/strong&gt;; stay tuned for the article. &lt;/p&gt;

&lt;p&gt;Happy coding!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmedia4.giphy.com%2Fmedia%2Fv1.Y2lkPTc5MGI3NjExZHZnd2pleXYzeHYzMTdpeDZtdnoxbWp0c2h2YmVhNm0ycXUzaWU2byZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw%2FAdX5dZF7LjigSMZZMm%2Fgiphy.gif" class="article-body-image-wrapper"&gt;&lt;img width="50%" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmedia4.giphy.com%2Fmedia%2Fv1.Y2lkPTc5MGI3NjExZHZnd2pleXYzeHYzMTdpeDZtdnoxbWp0c2h2YmVhNm0ycXUzaWU2byZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw%2FAdX5dZF7LjigSMZZMm%2Fgiphy.gif"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;To see more ...&lt;/h2&gt;

&lt;p&gt;Please join our LLMWare community on discord to learn more about RAG / LLMs and share your thoughts! &lt;a href="https://discord.gg/5mx42AGbHm" rel="noopener noreferrer"&gt;https://discord.gg/5mx42AGbHm&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/llmware-ai/llmware" class="crayons-btn crayons-btn--primary" rel="noopener noreferrer"&gt;Visit LLMWare's Website&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/llmware-ai/llmware" class="crayons-btn crayons-btn--primary" rel="noopener noreferrer"&gt;Explore LLMWare on GitHub&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.freepik.com/free-vector/personal-computer-screen-with-old-software-windows_36102472.htm#fromView=image_search_similar&amp;amp;page=1&amp;amp;position=49&amp;amp;uuid=6524a515-1b2e-441e-8993-357233bf186d&amp;amp;query=cute+coding" rel="noopener noreferrer"&gt;Image from Freepik&lt;/a&gt;&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>ai</category>
      <category>learning</category>
      <category>python</category>
    </item>
    <item>
      <title>How I Learned Generative AI in Two Weeks (and You Can Too): Part 2 - Embeddings</title>
      <dc:creator>Julia Zhou</dc:creator>
      <pubDate>Fri, 11 Oct 2024 11:41:08 +0000</pubDate>
      <link>https://forem.com/llmware/how-i-learned-generative-ai-in-two-weeks-and-you-can-too-part-2-embeddings-2ppc</link>
      <guid>https://forem.com/llmware/how-i-learned-generative-ai-in-two-weeks-and-you-can-too-part-2-embeddings-2ppc</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;A few weeks ago, I shared my experience learning about Generative AI Libraries through LLMWare's Fast Start to RAG example 1. Today, I will continue this series by taking you through example 2. This is personally one of my favorite "lessons" in this LLMWare series, so I hope you will find it thought-provoking as well! This example will focus on &lt;strong&gt;embeddings and vectors&lt;/strong&gt;. Let us start by exploring what exactly these terms mean! &lt;/p&gt;

&lt;h2&gt;How do embedding models work?&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Embedding models&lt;/strong&gt; are trained on large amounts of language tokens to either predict the next token or fill in missing tokens. In either case, these models learn how to represent language! They take in large chunks of text as input and process them through tokenization (breaking the text down into smaller pieces), conversion into numbers, and various layers of transformations. These steps build a representation of the input text that forms the output: vectors. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Vectors&lt;/strong&gt; are created when the input text is translated into the language through which the model sees the world. Geometrically speaking, they are points in n-dimensional space, where "n" is the number of embedding dimensions (typically, n is 768). The dimensions are represented by n floats, usually ranging between 0 and 1 or between -1 and 1. Converting the text to numbers allows the model to more easily compare the similarity of two texts.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i.giphy.com/media/i1xNYj8xdUmR7z3CrH/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/i1xNYj8xdUmR7z3CrH/giphy.gif" width="480" height="360"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Try thinking back to high school geometry! You might remember that two points (or shapes) that are close to each other are considered more similar to one another than two far away points. This process is exactly what the model performs to compare texts and is known as a &lt;strong&gt;semantic search&lt;/strong&gt;. Once a query is converted to a vector, that vector is compared to all the other vectors in the database. The ones that are the most similar are returned. &lt;/p&gt;
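
&lt;p&gt;Here is a toy sketch of the idea in plain Python; the three-dimensional vectors are made up for illustration (real embeddings have hundreds of dimensions, and the exact distance metric depends on the vector database):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import math

#   toy 3-dimensional "embeddings" (real models typically use ~768 dimensions)
query_vec = [0.9, 0.1, 0.2]   # "incentive compensation"
close_vec = [0.8, 0.2, 0.1]   # a passage about bonus plans
far_vec   = [0.1, 0.9, 0.7]   # an unrelated passage

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

print(euclidean(query_vec, close_vec))   # small distance: similar meaning
print(euclidean(query_vec, far_vec))     # large distance: different meaning
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;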

&lt;p&gt;Now, we are ready to take a look at the example's code! This LLMWare Fast Start example can be run in the same way as example 1, but instructions can be found in our &lt;a href="https://github.com/llmware-ai/llmware/blob/main/fast_start/README.md" rel="noopener noreferrer"&gt;README file&lt;/a&gt; if needed. Example 2 is directly copy-paste ready!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/llmware-ai/llmware/blob/main/fast_start/rag/example-2-build_embeddings.py" class="ltag_cta ltag_cta--branded" rel="noopener noreferrer"&gt;Example 2: Embeddings&lt;/a&gt;
&lt;/p&gt;

&lt;h2&gt;Extra resources&lt;/h2&gt;

&lt;p&gt;In case you missed it, I will link my previous article in this series since this example will continue building on the foundation we built in example 1. The same process for creating libraries is utilized in example 2, so I will skip over it here. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/llmware/how-i-learned-generative-ai-in-two-weeks-and-you-can-too-part-1-libraries-215h" class="ltag_cta ltag_cta--branded"&gt;Article - Example 1: Libraries&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;For visual learners, here is a video that works through example 2. Feel free to watch the video before following the steps in this article. Also, here is a Python Notebook that breaks down this example's code alongside the output.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/llmware-ai/llmware/blob/main/examples/Notebooks/fast_start_examples/example_2_build_embeddings_version_1.ipynb" class="ltag_cta ltag_cta--branded" rel="noopener noreferrer"&gt;Example 2 Notebook&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/2xDefZ4oBOM"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;Part 1 - Creating embeddings &amp;amp; storing vectors&lt;/h2&gt;

&lt;p&gt;As mentioned above, we will not cover the library building process in this article and will move directly into embedding models. For this demo, we will use the "mini-lm-sbert" model, which is efficient and is included in the default LLMWare package. Feel free to experiment with different models, including the OpenAI Text Embedding Ada!&lt;/p&gt;

&lt;p&gt;Recall that in example 1, we not only created our library but also added our documents into a database. This database will make it extremely convenient to access text chunks that we can give to the embedding model. &lt;/p&gt;

&lt;p&gt;Once the library has been created, let us focus our attention on the most important line of code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;library.install_new_embedding(embedding_model_name=embedding_model, vector_db=vector_db,batch_size=100)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This line calls the &lt;code&gt;install_new_embedding&lt;/code&gt; function and passes in the embedding model and vector database names as parameters. The final parameter, &lt;code&gt;batch_size&lt;/code&gt;, determines how many text chunks will be processed at a time. Considerations like efficiency, memory, model capability, and database size all factor into choosing the most appropriate batch size. &lt;/p&gt;

&lt;p&gt;We can confirm that our embedding creation and vector storage was a success!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;update = Status().get_embedding_status(library_name, embedding_model)
print("update: Embeddings Complete - Status() check at end of embedding - ", update)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Part 2 - Queries
&lt;/h2&gt;

&lt;p&gt;Now that we have the vector database, we can begin running queries on it! We will create a very simple query, pass it to a Query object built on the library, and run a &lt;strong&gt;semantic query&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sample_query = "incentive compensation"
query_results = Query(library).semantic_query(sample_query, result_count=20)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We will use the following portion of code to iterate through the query results and view them, paying special attention to the &lt;code&gt;distance&lt;/code&gt; field.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;for i, entries in enumerate(query_results):
  text = entries["text"]
  document_source = entries["file_source"]
  page_num = entries["page_num"]
  vector_distance = entries["distance"]

  if len(text) &amp;gt; 125: text = text[0:125] + " ... "

  print("\nupdate: query results - {} - document - {} - page num - {} distance - {} ".format(i, document_source, page_num, vector_distance))

  print("update: text sample - ", text)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let us run the example to see the results in action!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i.giphy.com/media/nFLW7PNGgN3lI68rdv/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/nFLW7PNGgN3lI68rdv/giphy.gif" width="480" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Part 3 - The results
&lt;/h2&gt;

&lt;p&gt;Through the output, we can see that at first, we have no embeddings.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;embedding record - before embedding  [{'embedding_status': 'no', 'embedding_model': 'none', 'embedding_db': 'none', 'embedded_blocks': 0, 'embedding_dims': 0, 'time_stamp': 'NA'}]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, a series of outputs shows that we are creating embeddings in batches of 100, as expected. By the end, all of the text chunks will have been converted to vectors.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;update: Embeddings Complete - Status() check at end of embedding -  [{'_id': 2, 'key': 'example2_library_embedding_mini-lm-sbert', 'summary': '2211 of 2211 blocks', 'start_time': '1717690179.087806', 'end_time': '1717690199.5373614', 'total': 2211, 'current': 2211, 'units': 'blocks'}]

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, we have arrived back at the query-results for-loop mentioned above. Looking at the first result, we can see that one of the many metadata fields returned is &lt;code&gt;distance&lt;/code&gt;: the distance between the vector for our query ("incentive compensation") and the vector for this sample block.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;update: query results - 0 - document - Artemis Poseidon EXECUTIVE EMPLOYMENT AGREEMENT.pdf - page num - 4 distance - 0.24837934970855713 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The query results are sorted from lowest to highest distance - that is, from most to least similar. For comparison, we can see that the result at index 10 (the eleventh returned) has a higher distance than the first one!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;update: query results - 10 - document - Eileithyia EXECUTIVE EMPLOYMENT AGREEMENT.pdf - page num - 3 distance - 0.27305811643600464 
update: text sample -  in Employer's annual cash incentive   bonus plan (the “Plan”), based on the same terms and conditions as in existence for oth ... 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
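
&lt;p&gt;If you want to convince yourself of this ordering, a quick sanity-check sketch over the loop's &lt;code&gt;query_results&lt;/code&gt; might look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# sanity check sketch: distances should be non-decreasing across the results
distances = [entries["distance"] for entries in query_results]
assert distances == sorted(distances), "results are not ordered by distance"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;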



&lt;h2&gt;
  
  
  Part 4 - Further exploration
&lt;/h2&gt;

&lt;p&gt;For this example, we used the "faiss" vector database, but I encourage you to experiment with others as well. &lt;/p&gt;

&lt;p&gt;Similarly, try using different embedding models to see how their characteristics might be optimized for certain types of inputs! A series of examples involving embeddings can be found on the LLMWare GitHub page.&lt;/p&gt;
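
&lt;p&gt;Swapping these choices is just a matter of changing the parameters to &lt;code&gt;install_new_embedding&lt;/code&gt;. Here is a sketch - the model and database names are illustrative, so check the llmware model catalog and supported vector databases before using them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# sketch: re-run the embedding step with a different model and vector database
# ("industry-bert-contracts" and "milvus" are illustrative names - verify first)
library.install_new_embedding(embedding_model_name="industry-bert-contracts",
                              vector_db="milvus",
                              batch_size=100)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;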

&lt;p&gt;&lt;a href="https://github.com/llmware%20ai/llmware/tree/main/examples/Embedding" class="ltag_cta ltag_cta--branded" rel="noopener noreferrer"&gt;Embeddings Examples&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;I hope you enjoyed this example about embeddings and vectors! The next example will be about &lt;strong&gt;prompts and models&lt;/strong&gt; - stay tuned for the article. &lt;/p&gt;

&lt;p&gt;Happy coding!&lt;/p&gt;

&lt;h2&gt;
  
  
  To see more ...
&lt;/h2&gt;

&lt;p&gt;Please join our LLMWare community on Discord to learn more about RAG and LLMs! &lt;a href="https://discord.gg/5mx42AGbHm" rel="noopener noreferrer"&gt;https://discord.gg/5mx42AGbHm&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/llmware-ai/llmware" class="ltag_cta ltag_cta--branded" rel="noopener noreferrer"&gt;Visit LLMWare's Website&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/llmware-ai/llmware" class="ltag_cta ltag_cta--branded" rel="noopener noreferrer"&gt;Explore LLMWare on GitHub&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.freepik.com/free-photo/view-adorable-3d-cat_45138549.htm#fromView=search&amp;amp;page=1&amp;amp;position=1&amp;amp;uuid=c7c3603a-a846-4ddf-8a71-b0346612cef6" rel="noopener noreferrer"&gt;Image from Freepik&lt;/a&gt;&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>ai</category>
      <category>learning</category>
      <category>python</category>
    </item>
    <item>
      <title>How I Learned Generative AI in Two Weeks (and You Can Too): Part 1 - Libraries</title>
      <dc:creator>Julia Zhou</dc:creator>
      <pubDate>Thu, 12 Sep 2024 21:54:51 +0000</pubDate>
      <link>https://forem.com/llmware/how-i-learned-generative-ai-in-two-weeks-and-you-can-too-part-1-libraries-215h</link>
      <guid>https://forem.com/llmware/how-i-learned-generative-ai-in-two-weeks-and-you-can-too-part-1-libraries-215h</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;For reference, prior to this journey, I barely had more knowledge about AI than the average person. Sure, I fired off the occasional ChatGPT request for one task or another, but I was always more focused on coding than AI, having picked up Python and Java during quarantine.  &lt;/p&gt;

&lt;p&gt;Despite my initial skepticism at being able to successfully understand the examples, particularly in a short time frame, I found LLMWare's "Fast Start to RAG" series highly accessible. I will cover example one of the course in this article - hopefully it can help you as well! If you are interested in learning more about LLMWare, feel free to check out our &lt;a href="https://llmware.ai/" rel="noopener noreferrer"&gt;website&lt;/a&gt; as well as another &lt;a href="https://dev.to/llmware/become-a-rag-professional-in-2024-go-from-beginner-to-expert-41mg"&gt;DEV article&lt;/a&gt; outlining the Fast Start to RAG examples. &lt;/p&gt;

&lt;p&gt;To clarify, extensive knowledge of coding, specifically Python 3, is not necessarily a prerequisite for the examples that I used to get my start in AI and RAG. However, basic understanding is certainly helpful in comprehending content and parsing code. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://i.giphy.com/media/CuuSHzuc0O166MRfjt/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/CuuSHzuc0O166MRfjt/giphy.gif" width="480" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting started
&lt;/h2&gt;

&lt;p&gt;To run these examples, you will need to install the LLMWare package by running &lt;code&gt;pip3 install llmware&lt;/code&gt; in the command line. Further instructions can be found in our &lt;a href="https://github.com/llmware-ai/llmware/blob/main/fast_start/README.md" rel="noopener noreferrer"&gt;README file&lt;/a&gt;. Then, you will be able to run &lt;a href="https://github.com/llmware-ai/llmware/blob/main/fast_start/example-1-create_first_library.py" rel="noopener noreferrer"&gt;example 1&lt;/a&gt;, which is directly copy-paste ready.  &lt;/p&gt;

&lt;p&gt;I will also point out that the AI community tends to use acronyms (like AI itself!) and technical language extending beyond the scope of everyday conversation. The acronym "RAG" stands for Retrieval Augmented Generation, which enhances outputs of LLMs (Large Language Models) using external knowledge. In Example 1, we will be focusing on the first step in RAG - converting a pile of files into an AI-ready knowledge base.&lt;/p&gt;

&lt;h2&gt;
  
  
  Extra resources
&lt;/h2&gt;

&lt;p&gt;For visual learners, here is a video that works through example 1. Feel free to watch the video before following the steps in this article. Also, here is a Python Notebook that breaks down this example's code alongside the output: &lt;a href="https://github.com/llmware-ai/llmware/blob/main/examples/Notebooks/NoteBook_Examples/example_1_create_first_library.ipynb" rel="noopener noreferrer"&gt;Example 1 Notebook&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/2xDefZ4oBOM"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  Part 1 - Execution configuration
&lt;/h2&gt;

&lt;p&gt;By default, the active database being used is called "mongo", but we will select "sqlite" since it does not require a separate installation. &lt;/p&gt;

&lt;p&gt;Additionally, we can use different debug mode options to see more or less information as it is processed. We can set &lt;code&gt;debug_mode&lt;/code&gt; to 2 for more detailed outputs compared to 0, the default. &lt;/p&gt;

&lt;p&gt;For this example, sample data sets are imported through &lt;code&gt;from llmware.setup import Setup&lt;/code&gt; and are stored in &lt;code&gt;sample_folders&lt;/code&gt;. These sets include documents of different subject matter and sizes, but you can replace them with your own data as well. We can choose a name for our library (go ahead and customize!) and select a folder from the samples before running the main script.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;LLMWareConfig().set_active_db("sqlite")

LLMWareConfig().set_config("debug_mode", 2)

sample_folders = ["Agreements", "Invoices", "UN-Resolutions-500", "SmallLibrary", "FinDocs", "AgreementsLarge"]
library_name = "example1_library"
selected_folder = sample_folders[0]     # e.g., "Agreements"

output = parsing_documents_into_library(library_name, selected_folder)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://i.giphy.com/media/ua7vVw9awZKWwLSYpW/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/ua7vVw9awZKWwLSYpW/giphy.gif" width="480" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Part 2 - Main body
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Step 1:&lt;/strong&gt; Now, we can create our library! This line of code will set up the database tables as well as supporting file repositories to store information about the library.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;library = Library().create_new_library(library_name)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Steps 2 and 3:&lt;/strong&gt; However, our library is still completely empty, so we need to fill it up. To do so, we will load in the LLMWare sample files and save them in &lt;code&gt;sample_files_path&lt;/code&gt;. If you are using your own data sets, you will need to point to a local folder path with your documents.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sample_files_path = Setup().load_sample_files(over_write=False)
ingestion_folder_path = os.path.join(sample_files_path, sample_folder)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
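
&lt;p&gt;If you are working with your own data instead of the samples, a one-line substitution (with a hypothetical path) is all that is needed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# point the ingestion at your own folder of documents (hypothetical path)
ingestion_folder_path = "/path/to/your/documents"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;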



&lt;p&gt;&lt;strong&gt;Step 4:&lt;/strong&gt; While adding files to a library, LLMWare performs parsing, text chunking, and indexing in the sqlite database. It automatically chooses the correct parser based on each file's extension, then extracts the text and stores it as chunks in the database. Although this may seem like a lot of steps, it all happens incredibly quickly behind the scenes!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;parsing_output = library.add_files(ingestion_folder_path)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 5:&lt;/strong&gt; To check our progress, we can look at the &lt;code&gt;updated_library_card&lt;/code&gt;, which contains key metadata, counting data, and other important information. This &lt;code&gt;.get_library_card()&lt;/code&gt; method can be called at any time to retrieve information about your library.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;updated_library_card = library.get_library_card()
doc_count = updated_library_card["documents"]
block_count = updated_library_card["blocks"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
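
&lt;p&gt;A quick print of those counts (a small sketch) confirms the ingest:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# sketch: confirm how much was ingested
print("update: documents - {} - blocks - {}".format(doc_count, block_count))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;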



&lt;p&gt;&lt;strong&gt;Steps 6 and 7:&lt;/strong&gt; We can check the library's main folder structure, and then the library is ready to start running queries! We will do this by instantiating a Query object on the library. The &lt;code&gt;test_query&lt;/code&gt; may need to be adjusted to best suit the data set. For this example, we chose the "Agreements" sample set, so we can use "base salary" as a "hello world"-esque query. &lt;/p&gt;

&lt;p&gt;Now, a text query will be run that looks at every chunk of text and returns the ones that contain "base salary". The Query class contains many methods for different query types; today, we will use the simplest, the &lt;code&gt;text_query&lt;/code&gt; method.&lt;br&gt;
&lt;/p&gt;
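
&lt;p&gt;First, the query string is defined - here "base salary", matching the "Agreements" sample set chosen above (the variable name follows the example script):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;test_query = "base salary"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;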

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;query_results = Query(library).text_query(test_query, result_count=10)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can print out our results, giving us a look at the metadata and attributes of the individual text blocks we created!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;for i, result in enumerate(query_results):
        #   here are a few useful attributes
        text = result["text"]
        file_source = result["file_source"]
        page_number = result["page_num"]
        doc_id = result["doc_ID"]
        block_id = result["block_ID"]
        matches = result["matches"]

        print("query results: ", i, result)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://i.giphy.com/media/Z3VgQu8hkVeB1bakS9/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/Z3VgQu8hkVeB1bakS9/giphy.gif" width="480" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Part 3 - The results
&lt;/h2&gt;

&lt;p&gt;The outputted summary will include key information such as &lt;code&gt;total pdf files processed&lt;/code&gt;, &lt;code&gt;total blocks created&lt;/code&gt;, &lt;code&gt;total pages added&lt;/code&gt;, and &lt;code&gt;time elapsed&lt;/code&gt;. See if you can find all of them! &lt;/p&gt;

&lt;p&gt;In particular, the LLMWare package includes C-based parsers that quickly and efficiently parse files. Once parsing is complete, the parsed information is output as a dictionary. You will see the results of your work from the previous steps!&lt;/p&gt;

&lt;p&gt;To summarize, we took our documents and broke them down into thousands of blocks. Then, we extracted text information and put it into the sqlite database. Lastly, we ran a text search against that data to retrieve our results (including details as small as pixel coordinates and character level matches!).&lt;/p&gt;

&lt;p&gt;You just completed your first example, but there is so much more for you to explore! I would suggest rerunning this example with varied data sets to tap into the true potential of this technology, and of course, continue onto example 2 about &lt;a href="https://github.com/llmware-ai/llmware/blob/main/fast_start/example-2-build_embeddings.py" rel="noopener noreferrer"&gt;building embeddings&lt;/a&gt;!&lt;/p&gt;

&lt;p&gt;Happy coding!&lt;/p&gt;

&lt;h2&gt;
  
  
  Part 4 - To see more ...
&lt;/h2&gt;

&lt;p&gt;Please join our LLMWare community on Discord to learn more about RAG and LLMs! &lt;a href="https://discord.gg/5mx42AGbHm" rel="noopener noreferrer"&gt;https://discord.gg/5mx42AGbHm&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/llmware-ai/llmware" class="ltag_cta ltag_cta--branded" rel="noopener noreferrer"&gt;Visit LLMWare's Website&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/llmware-ai/llmware" class="ltag_cta ltag_cta--branded" rel="noopener noreferrer"&gt;Explore LLMWare on GitHub&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.freepik.com/free-photo/computer-scientist-updating-ai-systems_237235999.htm#fromView=image_search_similar&amp;amp;page=1&amp;amp;position=3&amp;amp;uuid=0b4ac661-5087-4321-a5cf-0838108d5997" rel="noopener noreferrer"&gt;Image by DC Studio on Freepik&lt;/a&gt;&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>ai</category>
      <category>learning</category>
      <category>python</category>
    </item>
    <item>
      <title>Evaluating LLMs and Prompts with Electron UI 🤖 💬</title>
      <dc:creator>Will Taner</dc:creator>
      <pubDate>Wed, 07 Aug 2024 13:02:07 +0000</pubDate>
      <link>https://forem.com/llmware/evaluating-llms-and-prompts-with-electron-ui-4jkl</link>
      <guid>https://forem.com/llmware/evaluating-llms-and-prompts-with-electron-ui-4jkl</guid>
      <description>&lt;h2&gt;
  
  
  What is this UI useful for? 🤨
&lt;/h2&gt;

&lt;p&gt;LLMs are becoming an increasingly prevalent tool across various industries. However, achieving optimal results greatly depends on selecting the correct model and prompts. This process can be extremely time-consuming as it requires extensive trial and error.&lt;/p&gt;

&lt;p&gt;This article serves as a tutorial on a tool designed to streamline the process of testing various models and prompts. By using this tool, developers can efficiently identify the most effective combinations.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i.giphy.com/media/MvovQGsMBY9H2/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/MvovQGsMBY9H2/giphy.gif" alt="GIF" width="500" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Big shoutout to Kevin Brisson for creating this Electron UI! You may find the GitHub Repo for his tool here: &lt;a href="https://github.com/kbrisso/ai-base" rel="noopener noreferrer"&gt;https://github.com/kbrisso/ai-base&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: The CLI commands provided in this article are designed for Linux/Mac systems and may not function correctly on Windows machines. If you encounter any issues, please replace the incompatible commands with their Windows equivalents.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step-by-Step Guide on Getting the Tool up and Running!
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://i.giphy.com/media/l0IyjiXOXTX6Yemsg/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/l0IyjiXOXTX6Yemsg/giphy.gif" alt="GIF" width="480" height="270"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;(If you would like to watch a video demonstrating the setup, click &lt;a href="https://youtu.be/5VM583r3JaM" rel="noopener noreferrer"&gt;here&lt;/a&gt;)&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Let's Start by Cloning the Repo 💻
&lt;/h3&gt;

&lt;p&gt;Navigate to a directory of your choosing and run &lt;code&gt;git clone https://github.com/kbrisso/ai-base&lt;/code&gt; in your command line.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Root Directory Installs 📦
&lt;/h3&gt;

&lt;p&gt;Navigate to the root directory, &lt;code&gt;ai-base&lt;/code&gt;, in your command line and run &lt;code&gt;npm install&lt;/code&gt;. If you experience an error, run &lt;code&gt;npm audit fix&lt;/code&gt; to resolve the reported vulnerabilities in the packages.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. llmware-wrapper Installs 📦
&lt;/h3&gt;

&lt;p&gt;Navigate to the directory &lt;code&gt;llmware-wrapper&lt;/code&gt; in your command line and run the same commands as above: &lt;code&gt;npm install&lt;/code&gt;, followed by &lt;code&gt;npm audit fix&lt;/code&gt; if there is an error.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Create a Virtual Environment and Install Packages 💾
&lt;/h3&gt;

&lt;p&gt;While in the same directory as above, &lt;code&gt;llmware-wrapper&lt;/code&gt;, create a new virtual environment: &lt;code&gt;python3 -m venv venv&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Then enter into the virtual environment by running &lt;code&gt;source venv/bin/activate&lt;/code&gt; in the command line. &lt;/p&gt;

&lt;p&gt;Then run &lt;code&gt;pip install -r requirements.txt&lt;/code&gt; to install the required packages.&lt;/p&gt;

&lt;p&gt;Finally, deactivate the virtual environment by running &lt;code&gt;deactivate&lt;/code&gt;.&lt;/p&gt;
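
&lt;p&gt;In summary, the commands for this step (Linux/Mac, per the note above) are:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
deactivate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;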

&lt;h3&gt;
  
  
  5. Copy Local Python Path 🐍
&lt;/h3&gt;

&lt;p&gt;In the file explorer of your IDE, open the file titled &lt;code&gt;llmware-wrapper.properties&lt;/code&gt; in the &lt;code&gt;llmware-wrapper&lt;/code&gt; directory.&lt;/p&gt;

&lt;p&gt;In this file, delete the path that the variable &lt;code&gt;pythonpath&lt;/code&gt; is currently set to and replace it with the path to your local Python interpreter.&lt;/p&gt;

&lt;p&gt;If you do not know your local path, run &lt;code&gt;which python&lt;/code&gt; in the command line, then copy and paste the result where you deleted the previous path.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i.giphy.com/media/o0QlbQONyHwBiBCBi3/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/o0QlbQONyHwBiBCBi3/giphy.gif" alt="GIF" width="480" height="270"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Let's start the UI! 🚀
&lt;/h2&gt;

&lt;p&gt;Navigate to the root directory, &lt;code&gt;ai-base&lt;/code&gt;, in your command line and run &lt;code&gt;npm start&lt;/code&gt;. After a few seconds a window of the UI should pop up.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvmrk7dwysr80zzj82nfm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvmrk7dwysr80zzj82nfm.png" alt="UI startup image" width="800" height="591"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Selecting a Model and Prompt 💡
&lt;/h2&gt;

&lt;p&gt;Click on the button that says "Choose a Model". This will display all the available models provided by LLMWare. Once you find a model that you would like to try, click the button that says "Choose" next to the model name.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjwd3ukfg7f2ybktljgt7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjwd3ukfg7f2ybktljgt7.png" alt="models image" width="800" height="591"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Click on the button that says "Choose a Prompt". This will display all the available types of prompts you can choose from provided by LLMWare. Additionally, you may find supplemental information about the prompt type, such as a description, to the right of the prompt name. Once you find a prompt type that you would like to try, click the button that says "Choose" next to the prompt name.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7w2d6km9b6i475fybm1m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7w2d6km9b6i475fybm1m.png" alt="prompt image" width="800" height="591"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Putting it all together... 🧩
&lt;/h2&gt;

&lt;p&gt;After selecting a model and a prompt type, it is time to start querying! Simply add your query to the box labeled "Query" and click the button that says "Run Query". After some time, the response will show up in the box titled "Response".&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Depending on the prompt type chosen, an extra box for context will appear. You may use this space to provide relevant details for your query if needed.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2kf25wz1p6ppvxe9c9l5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2kf25wz1p6ppvxe9c9l5.png" alt="example image" width="800" height="591"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Now you can experiment with different models and queries with ease!&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion 🏁
&lt;/h2&gt;

&lt;p&gt;Finding the right combination of model and prompt is crucial to creating a reliable and effective LLM tool. Utilizing this UI, you can find the best combination faster than ever before!&lt;/p&gt;

&lt;p&gt;Please check out our GitHub and leave a star! &lt;a href="https://github.com/llmware-ai/llmware" rel="noopener noreferrer"&gt;https://github.com/llmware-ai/llmware&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Follow us on Discord here: &lt;a href="https://discord.gg/MgRaZz2VAB" rel="noopener noreferrer"&gt;https://discord.gg/MgRaZz2VAB&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Please be sure to visit our website &lt;a href="https://llmware.ai"&gt;llmware.ai&lt;/a&gt; for more information and updates.&lt;/p&gt;

</description>
      <category>python</category>
      <category>ai</category>
      <category>productivity</category>
      <category>programming</category>
    </item>
    <item>
      <title>🤖Dueling AIs: Questioning and Answering with Language Models🚀</title>
      <dc:creator>Prashant Iyer</dc:creator>
      <pubDate>Sun, 28 Jul 2024 20:36:55 +0000</pubDate>
      <link>https://forem.com/llmware/dueling-ais-questioning-and-answering-with-language-models-5f0l</link>
      <guid>https://forem.com/llmware/dueling-ais-questioning-and-answering-with-language-models-5f0l</guid>
      <description>&lt;p&gt;You've probably asked a question to a &lt;em&gt;language model&lt;/em&gt; before and then had it give you an answer. After all, this is what we most commonly use language models for.&lt;/p&gt;

&lt;p&gt;But have you ever received a question from a language model? While not as common, this application of AI has diverse use cases in areas like education, where you might want a model to give you practice questions for a test, and in sales enablement, where you question your business's sales team about your products to improve their ability to make sales.&lt;/p&gt;

&lt;p&gt;Now, &lt;strong&gt;what if we had a face off⚔️ between two different models&lt;/strong&gt;: one that asked questions about a topic and another that answered them? All without human intervention?&lt;/p&gt;

&lt;p&gt;In this article, we're going to look at exactly that. We'll provide a sample passage about OpenAI's AI safety team as context to our models. We'll then let our models duel it out! One model will ask questions based on this passage, and another model will respond!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbe8905k96aob2nvvd1sj.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbe8905k96aob2nvvd1sj.gif" alt="Duel GIF" width="500" height="208"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Our AI Models🤖
&lt;/h2&gt;

&lt;p&gt;Introducing &lt;code&gt;slim-q-gen-tiny-tool&lt;/code&gt;. This will be our question model, capable of generating three different types of questions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multiple choice questions&lt;/li&gt;
&lt;li&gt;Boolean (true/false) questions&lt;/li&gt;
&lt;li&gt;General open-ended questions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Facing off against this will be &lt;code&gt;bling-phi-3-gguf&lt;/code&gt;! This will be our answer model, giving appropriate responses to any of the above types of questions.&lt;/p&gt;

&lt;p&gt;One important note is that both of these models are &lt;em&gt;GGUF quantized&lt;/em&gt;, meaning they are smaller and faster versions of their original counterparts. For us, this means we can run them on just a CPU, with no need for resources like GPUs!&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 1: Providing input parameters✏️
&lt;/h2&gt;

&lt;p&gt;This is what our function signature for this example looks like.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;ask_and_answer_game&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;source_passage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;q_model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;slim-q-gen-tiny-tool&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;number_of_tries&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;question_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;source_passage&lt;/code&gt; is the text input that we will provide our models,&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;q_model&lt;/code&gt; is our questioning model,&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;number_of_tries&lt;/code&gt; is the number of questions we will attempt to generate (more on this later!)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;question_type&lt;/code&gt; can be either &lt;code&gt;"multiple choice"&lt;/code&gt;, &lt;code&gt;"boolean"&lt;/code&gt; or &lt;code&gt;"question"&lt;/code&gt; corresponding to each of the types of questions we saw above,&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;temperature&lt;/code&gt; is a value ranging from 0 to 1 that determines how much variance we will see in our generated questions. Here, the value of 0.5 is relatively high so that we get a good variety of questions with little repetition! (See the sketch just after this list.)&lt;/li&gt;
&lt;/ul&gt;
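
&lt;p&gt;Putting the parameters together, a minimal invocation sketch looks like this (&lt;code&gt;test_passage&lt;/code&gt; is a placeholder for whatever source text you provide):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# sketch: calling the example's function with the defaults described above
test_passage = "..."   # placeholder - substitute your own source passage

ask_and_answer_game(source_passage=test_passage,
                    q_model="slim-q-gen-tiny-tool",
                    number_of_tries=10,
                    question_type="question",
                    temperature=0.5)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;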




&lt;h2&gt;
  
  
  Step 2: Loading in our models🪫🔋
&lt;/h2&gt;

&lt;p&gt;With the inputs taken care of, let's now load in both our models.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;q_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ModelCatalog&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;load_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice that we have &lt;code&gt;sample=True&lt;/code&gt; to increase variety in our model output (the questions generated).&lt;/p&gt;

&lt;p&gt;Now, for the answer model.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;answer_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ModelCatalog&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;load_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bling-phi-3-gguf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We won't mess with the &lt;code&gt;sample&lt;/code&gt; or &lt;code&gt;temperature&lt;/code&gt; options here because we want concise, fact-based answers from this model.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 3: Generating our questions🤔💬
&lt;/h2&gt;

&lt;p&gt;We'll try to generate questions &lt;code&gt;number_of_tries&lt;/code&gt; times, which in this case is 10. We'll then update our &lt;code&gt;questions&lt;/code&gt; list with only the unique questions, to avoid repetition.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;questions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

&lt;span class="c1"&gt;# Loop number_of_tries times
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;number_of_tries&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;q_model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;function_call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;source_passage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;question_type&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;new_q&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm_response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="c1"&gt;# Check to see that the question generated is unique
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;new_q&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;new_q&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;questions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;questions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;new_q&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;An important function here is &lt;code&gt;q_model.function_call()&lt;/code&gt;. This is how the &lt;code&gt;llmware&lt;/code&gt; library lets you prompt language models with &lt;strong&gt;just a single function call&lt;/strong&gt;. Here, we pass in the source text and question type as arguments.&lt;/p&gt;

&lt;p&gt;The function returns &lt;code&gt;response&lt;/code&gt;, a dictionary with a lot of information about the call, but we're only interested in the &lt;code&gt;question&lt;/code&gt; key, which is located inside the &lt;code&gt;llm_response&lt;/code&gt; dictionary.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 4: Responding to our questions📝
&lt;/h2&gt;

&lt;p&gt;Now that the questions have been generated, &lt;strong&gt;the duel is on!&lt;/strong&gt; Let's use our answering model to now respond to these questions. We'll loop through our &lt;code&gt;questions&lt;/code&gt; list, pass in the source passage as context to the model and ask each question.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Loop through each question
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;questions&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Print out the question
&lt;/span&gt;    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;question: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; - &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Validate the question list and run inference
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;answer_model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;inference&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;add_context&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;test_passage&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Print out the answer
&lt;/span&gt;        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;response: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm_response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It is important to note that our question model returns each &lt;code&gt;question&lt;/code&gt; as a &lt;code&gt;list&lt;/code&gt;, with the first element (&lt;code&gt;question[0]&lt;/code&gt;) containing the actual string corresponding to the question.&lt;/p&gt;

&lt;p&gt;For each &lt;code&gt;question&lt;/code&gt;, we then need to perform some validation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Check to see that the &lt;code&gt;question&lt;/code&gt; is of the correct data type (&lt;code&gt;list&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Check to see that the &lt;code&gt;question&lt;/code&gt; is not empty.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then, the &lt;code&gt;answer_model.inference()&lt;/code&gt; function will ask our model the question, passing in the &lt;code&gt;source_passage&lt;/code&gt; as context.&lt;/p&gt;

&lt;p&gt;Finally, we print out the response.&lt;/p&gt;




&lt;h2&gt;
  
  
  Results!✅
&lt;/h2&gt;

&lt;p&gt;Let's quickly look at our sample passage. This passage was taken from a CNBC news story in May 2024 about OpenAI's work with safety and security.&lt;/p&gt;

&lt;p&gt;"OpenAI said Tuesday it has established a new committee to make recommendations to the company’s board about safety and security, weeks after dissolving a team focused on AI safety. In a blog post, OpenAI said the new committee would be led by CEO Sam Altman as well as Bret Taylor, the company’s board chair, and board member Nicole Seligman. The announcement follows the high-profile exit this month of an OpenAI executive focused on safety, Jan Leike. Leike resigned from OpenAI leveling criticisms that the company had under-invested in AI safety work and that tensions with OpenAI’s leadership had reached a breaking point."&lt;/p&gt;

&lt;p&gt;Now, let's see what our output looks like!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx0w13ads9o8px7wdrihw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx0w13ads9o8px7wdrihw.png" alt="Sample output" width="711" height="431"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We can see all the questions that were asked about the passage, as well as concise, fact-based responses given to them!&lt;/p&gt;

&lt;p&gt;Note that there are only 9 questions here while we provided &lt;code&gt;number_of_tries=10&lt;/code&gt;. This means that one question generated was a duplicate and was ignored.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;And with that, we're done with this example! Recall that we used the &lt;code&gt;llmware&lt;/code&gt; library to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Load in a question model and an answer model&lt;/li&gt;
&lt;li&gt;Generate unique questions about a source passage&lt;/li&gt;
&lt;li&gt;Respond to each question with accuracy.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;And remember that we did all of this on just a CPU!&lt;/strong&gt; 💻&lt;/p&gt;

&lt;p&gt;Check out our YouTube video on this example!&lt;br&gt;
&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/380Yr2bc_Qk?start=143"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;If you made it this far, thank you for taking the time to go through this topic with us ❤️! For more content like this, make sure to &lt;a href="https://dev.to/llmware"&gt;visit our dev.to page&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The source code for many more examples like this one are on &lt;a href="https://github.com/llmware-ai/llmware" rel="noopener noreferrer"&gt;our GitHub&lt;/a&gt;. Find this example &lt;a href="https://github.com/llmware-ai/llmware/blob/a58c2dc7ea94c1a8eef87bc0fd1cc34fb616c743/examples/SLIM-Agents/using-slim-q-gen.py" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Our repository also contains a &lt;a href="https://github.com/llmware-ai/llmware/blob/main/examples/Notebooks/NoteBook_Examples/using-slim-q-gen-notebook.ipynb" rel="noopener noreferrer"&gt;notebook for this example&lt;/a&gt; that you can run yourself using Google Colab, Jupyter or any other platform that supports .ipynb notebooks.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://discord.gg/fCztJQeV7J" rel="noopener noreferrer"&gt;Join our Discord&lt;/a&gt; to interact with a growing community of AI enthusiasts of all levels of experience!&lt;/p&gt;

&lt;p&gt;Please be sure to visit our website &lt;a href="https://llmware.ai/" rel="noopener noreferrer"&gt;llmware.ai&lt;/a&gt; for more information and updates.&lt;/p&gt;

</description>
      <category>python</category>
      <category>beginners</category>
      <category>ai</category>
      <category>rag</category>
    </item>
    <item>
      <title>🚀Supercharged SLIM models Multistep RAG analysis that never leaves your CPU🧑‍💻</title>
      <dc:creator>Simon Risman</dc:creator>
      <pubDate>Tue, 02 Jul 2024 13:30:11 +0000</pubDate>
      <link>https://forem.com/llmware/supercharged-slim-models-multistep-rag-analysis-that-never-leaves-your-cpu-26j0</link>
      <guid>https://forem.com/llmware/supercharged-slim-models-multistep-rag-analysis-that-never-leaves-your-cpu-26j0</guid>
      <description>&lt;p&gt;Many of us are used to models running in the cloud, sending API calls to far-away servers, filed away as training data for the next wave of GPTs. And how else would this even work? Surely an individual laptop just doesn't have the power to manage and execute the workflows that a cloud based service does. &lt;/p&gt;

&lt;p&gt;Consider, for a moment, the mighty ant. At first glance, it may seem insignificant—a mere speck in the grand tapestry of nature. Yet, beneath its tiny exterior lies a powerhouse of strength, resilience, and ingenuity. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://i.giphy.com/media/v1.Y2lkPTc5MGI3NjExaW1xM2hjeWRpb3MzZDVrNmFyMmZwNW82dGFxcHoxcnVoank1b3UzZiZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw/26BRsfBU7ct4jgaCQ/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/v1.Y2lkPTc5MGI3NjExaW1xM2hjeWRpb3MzZDVrNmFyMmZwNW82dGFxcHoxcnVoank1b3UzZiZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw/26BRsfBU7ct4jgaCQ/giphy.gif" width="400" height="225"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Enter &lt;strong&gt;SLIM&lt;/strong&gt; - &lt;strong&gt;S&lt;/strong&gt;tructured &lt;strong&gt;L&lt;/strong&gt;anguage &lt;strong&gt;I&lt;/strong&gt;nstruction &lt;strong&gt;M&lt;/strong&gt;odels.🏋️
&lt;/h2&gt;

&lt;p&gt;These models are tiny and run comfortably on a CPU, but they pack a punch when it comes to providing specialized, structured outputs. Instead of an AI summary made of yet more bullet points or, god forbid, paragraphs, SLIM models output a variety of structured data like CSV, JSON, and SQL. &lt;/p&gt;
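
&lt;p&gt;For instance, a sentiment-focused SLIM returns a small dictionary rather than prose. A sketch - the model name comes from the llmware catalog, while the passage and the exact output shown are illustrative:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# sketch: a sentiment SLIM produces structured output via a single function call
from llmware.models import ModelCatalog

model = ModelCatalog().load_model("slim-sentiment-tool")
response = model.function_call("The earnings call was a disaster for investors.")
print(response["llm_response"])   # e.g. {'sentiment': ['negative']}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;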

&lt;p&gt;The highly specialized nature of the SLIM models is precisely what makes them so powerful - instead of a general solution to a large problem, stringing together a few SLIM models yields more robust performance with greater flexibility.&lt;/p&gt;

&lt;p&gt;To show just how much these models can do, we are going to take a look at a tech tale worthy of invoking Gavin Belson: The partnership-turned-rivalry between Microsoft and IBM.&lt;/p&gt;

&lt;p&gt;🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜&lt;/p&gt;

&lt;h2&gt;
  
  
  0️⃣ Setup 🛠️
&lt;/h2&gt;

&lt;p&gt;Make sure you have installed llmware and imported the libraries we are going to use. The code below should get you all set up. &lt;/p&gt;

&lt;p&gt;Run this command in your terminal&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install llmware
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add these imports to the top of your code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;shutil&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llmware.agents&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LLMfx&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llmware.library&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Library&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llmware.retrieval&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Query&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llmware.configs&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LLMWareConfig&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llmware.setup&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Setup&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  1️⃣ Build a Knowledge Base of Microsoft Documents 📖
&lt;/h2&gt;

&lt;p&gt;First, we need to create a database to query. In your case, it can be anything from customer service reports to earnings calls, but for now we will use a range of Microsoft-related documents.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;multistep_analysis&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;

    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt; In this example, our objective is to research Microsoft history and rivalry in the 1980s with IBM. &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="c1"&gt;#   step 1 - assemble source documents and create library
&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;update: Starting example - agent-multistep-analysis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;#   note: the program attempts to automatically pull sample document into local path
&lt;/span&gt;    &lt;span class="c1"&gt;#   depending upon permissions in your environment, you may need to set up directly
&lt;/span&gt;    &lt;span class="c1"&gt;#   if you pull down the samples files with Setup().load_sample_files(), in the Books folder,
&lt;/span&gt;    &lt;span class="c1"&gt;#   you will find the source: "Bill-Gates-Biography.pdf"
&lt;/span&gt;    &lt;span class="c1"&gt;#   if you have pulled sample documents in the past, then to update to latest: set over_write=True
&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;update: Loading sample files&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;sample_files_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Setup&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;load_sample_files&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;over_write&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;bill_gates_bio&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bill-Gates-Biography.pdf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;path_to_bill_gates_bio&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample_files_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Books&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bill_gates_bio&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;microsoft_folder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;LLMWareConfig&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;get_tmp_path&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;example_microsoft&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;update: attempting to create source input folder at path: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;microsoft_folder&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exists&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;microsoft_folder&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mkdir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;microsoft_folder&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chmod&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;microsoft_folder&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mo"&gt;0o777&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;shutil&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;copy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path_to_bill_gates_bio&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;microsoft_folder&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bill_gates_bio&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="c1"&gt;#   create library
&lt;/span&gt;    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;update: creating library and parsing source document&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nc"&gt;LLMWareConfig&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;set_active_db&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sqlite&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;my_lib&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Library&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;create_new_library&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;microsoft_history_0210_1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;my_lib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_files&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;microsoft_folder&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  2️⃣ Locate Mentions of IBM and Create an Agent to Process Them 🔍
&lt;/h2&gt;

&lt;p&gt;In our first pass, we focus on any mention of IBM. Since this is a multi-step process, we can then analyze those instances at a more granular level.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ibm&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;search_results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;my_lib&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;text_query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;update: executing query to filter to key passages - &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; - results found - &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;search_results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;#   create an agent and load several tools that we will be using
&lt;/span&gt;    &lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LLMfx&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load_tool_list&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sentiment&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;emotions&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;topic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tags&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ner&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;answer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="c1"&gt;#   load the search results into the agent's work queue
&lt;/span&gt;    &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load_work&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;search_results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  3️⃣ Pick out Negative Sentiment 🫳
&lt;/h2&gt;

&lt;p&gt;This is where you get to decide the depth of your analysis for each item. For our scenario, we want only the mentions of IBM that carry negative sentiment (evidence of the rivalry).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

        &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sentiment&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;increment_work_iteration&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
            &lt;span class="k"&gt;break&lt;/span&gt;

    &lt;span class="c1"&gt;#   analyze sections where the sentiment on ibm was negative
&lt;/span&gt;    &lt;span class="n"&gt;follow_up_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;follow_up_list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sentiment&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;negative&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  4️⃣ Deep Dive Analysis 🤿
&lt;/h2&gt;

&lt;p&gt;Now that we have picked out the instances we want to explore further, we put our agent's tools to work - each tool is a SLIM model specialized for a single task, and together they provide a comprehensive overview of the pertinent results.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;job_index&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;follow_up_list&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

        &lt;span class="c1"&gt;# follow-up 'deep dive' on selected text that references ibm negatively
&lt;/span&gt;        &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_work_iteration&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;job_index&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exec_multitool_function_call&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tags&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;emotions&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;topics&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ner&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is a brief summary?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;summary&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;my_report&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;show_report&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;follow_up_list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;activity_summary&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;activity_summary&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;entries&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;my_report&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my report entries: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;my_report&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Results 🎉🎉🎉
&lt;/h2&gt;

&lt;p&gt;Your multi-step local RAG pipeline should return a filled-out dictionary that looks something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;report 1 entries:  {'sentiment': ['negative'], 'tags': '["IBM", "COBOL", "PL/1", "BAL", "OS/2", "Presentation Manager", "K.", "OS/2 1.0", "December 1987", "1.0"]', 'emotions': ['anger'], 'topics': ['ibm'], 'people': [], 'organization': ['IBM'], 'misc': ['OS/2', 'Presentation Manager'], 'summary': ['•IBM wrote "clunky" code that was top-heavy with lines of documentation to make the software "easy to service."\t\t•IBM wrote "clunky" code that was top-heavy with lines of documentation to make the software "easy to service."\t\t•IBM wrote "clunky" code that was top-heavy with lines of documentation to make the software "easy to service."\t\t•IBM wrote'], 'source': {'query': 'ibm', '_id': '174', 'text': 'writers were contemptuous of IBM and it\'s coding   culture. In the increasingly irrelevant world of IBM, the classical   languages were COBOL, PL/1, and BAL (Basic Assembly Language),   NOT C!    J.    In addition, IBM wrote "clunky" code that was top-heavy with lines of   documentation to make the software "easy to service."   K.    Finally, in December 1987 OS/2 1.0 without Presentation Manager ', 'doc_ID': 1, 'block_ID': 173, 'page_num': 35, 'content_type': 'text', 'author_or_speaker': 'IBM_User', 'special_field1': '', 'file_source': 'Bill-Gates-Biography.pdf', 'added_to_collection': 'Mon Jul  1 13:14:36 2024', 'table': '', 'coords_x': 162, 'coords_y': 414, 'coords_cx': 34, 'coords_cy': 45, 'external_files': '', 'score': -4.040003091801133, 'similarity': 0.0, 'distance': 0.0, 'matches': [[29, 'ibm'], [100, 'ibm'], [215, 'ibm']], 'account_name': 'llmware', 'library_name': 'microsoft_history_0210_1'}}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The beauty of the output is its structured nature. You could easily hand your report off to another program - one that wouldn't need to waste precious time parsing natural language and could simply index into the right part of the dictionary. Besides saving time, you also gain accuracy and consistency.&lt;/p&gt;
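
&lt;p&gt;For instance, here is a minimal sketch of a downstream consumer of the report (assuming &lt;code&gt;my_report&lt;/code&gt; is the list of dictionaries returned above; the keys match the sample output):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def summarize_report(my_report):
    """Minimal sketch: consume the structured report directly - no text parsing needed."""
    for entry in my_report:
        # each entry is a plain dictionary, so we can index straight into it
        sentiment = entry.get("sentiment", [])
        orgs = entry.get("organization", [])
        summary = entry.get("summary", [])
        print(f"sentiment: {sentiment} | organizations: {orgs}")
        if summary:
            print(f"summary: {summary[0]}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;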

&lt;p&gt;If you want to learn more, below is a video walkthrough for this tutorial.&lt;/p&gt;

&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/y4WvwHqRR60"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;The full code for this example can be found in our &lt;a href="https://github.com/llmware-ai/llmware/blob/main/examples/SLIM-Agents/agent-multistep-analysis.py" rel="noopener noreferrer"&gt;GitHub repo&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you have any questions, or would like to learn more about LLMWare, come to our Discord community. Click &lt;a href="https://discord.gg/6nNVdn7A" rel="noopener noreferrer"&gt;here&lt;/a&gt; to join. See you there!🚀🚀🚀&lt;/p&gt;

&lt;p&gt;Please be sure to visit our website &lt;a href="https://llmware.ai/" rel="noopener noreferrer"&gt;llmware.ai&lt;/a&gt; for more information and updates.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>🔉From Sound to Insights: Using AI🤖 for Audio File Transcription and Analysis!🚀</title>
      <dc:creator>Prashant Iyer</dc:creator>
      <pubDate>Fri, 28 Jun 2024 20:57:16 +0000</pubDate>
      <link>https://forem.com/llmware/from-sound-to-insights-using-ai-for-audio-file-transcription-and-analysis-36ek</link>
      <guid>https://forem.com/llmware/from-sound-to-insights-using-ai-for-audio-file-transcription-and-analysis-36ek</guid>
      <description>&lt;p&gt;If we were given an audio file, is there any way we could identify the time stamps where specific words were said? Is there any way we could extract all the key words mentioned about a topic?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;With AI 🤖, we can do all of this and much more!&lt;/strong&gt; The key lies in being able to parse audio into text, allowing us to then harness the natural language processing capabilities of &lt;em&gt;language models&lt;/em&gt; to perform sophisticated analyses and inferences on our data.&lt;/p&gt;

&lt;p&gt;Regardless of who you are, such an approach to audio transcription and analysis will augment how you interact with and extract knowledge from audio files.&lt;/p&gt;

&lt;p&gt;Let's see how we can do this with &lt;code&gt;llmware&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  AI Tools 🤖
&lt;/h2&gt;

&lt;p&gt;We'll be using two models for this example.&lt;/p&gt;

&lt;p&gt;The first is Whisper by OpenAI. This is the model that will allow us to parse the audio files, i.e. convert them from audio to text.&lt;/p&gt;

&lt;p&gt;The second is the SLIM (&lt;em&gt;Structured Language Instruction Model&lt;/em&gt;) Extract Tool by LLMWare, which we'll be using to ask questions about our audio. This is a &lt;em&gt;GGUF quantized&lt;/em&gt; version of a much larger model called &lt;em&gt;slim-extract&lt;/em&gt;. All this means is that our model, the SLIM Extract Tool, is a smaller and faster version of the original model. &lt;strong&gt;This allows us to run it locally on a CPU&lt;/strong&gt;, without the need for powerful computational resources like GPUs!&lt;/p&gt;

&lt;p&gt;With that out of the way, let's get started with the example.&lt;/p&gt;
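
&lt;p&gt;Before we start, the snippets below assume a handful of imports from the &lt;code&gt;llmware&lt;/code&gt; library (plus &lt;code&gt;os&lt;/code&gt; from the standard library). The module paths shown here are our assumptions based on the library's layout, so double-check them against the full example linked at the end of this article:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import os

# assumed module paths - verify against the full example in the llmware repo
from llmware.setup import Setup           # sample audio file downloads
from llmware.parsers import Parser        # audio-to-text parsing
from llmware.util import Utilities        # fast text search over parser output
from llmware.models import ModelCatalog   # loading the SLIM Extract Tool
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;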




&lt;h2&gt;
  
  
  Step 1: Loading in audio files 🔉🔉
&lt;/h2&gt;

&lt;p&gt;If you have audio files that you want to run the example with, then feel free to use those by setting &lt;code&gt;input_folder&lt;/code&gt; appropriately, but if not, the &lt;code&gt;llmware&lt;/code&gt; library provides you with several sets of sample audio files!&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="n"&gt;voice_sample_files&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Setup&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;load_voice_sample_files&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;small_only&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;input_folder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;voice_sample_files&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;greatest_speeches&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here, we're loading in the &lt;code&gt;greatest_speeches&lt;/code&gt; set of audio files.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 2: Parsing our audio files 📝
&lt;/h2&gt;

&lt;p&gt;Now that we have our audio files, we can parse them into chunks of text. Recall that we'll need the Whisper model (via WhisperCPP) to do this. Fortunately, you won't have to interact with the model directly, since the &lt;code&gt;Parser&lt;/code&gt; class from the &lt;code&gt;llmware&lt;/code&gt; library will take care of this for you!&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="n"&gt;parser_output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Parser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_chunk_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;600&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;parse_voice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_folder&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;write_to_db&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;copy_to_library&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;remove_segment_markers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk_by_segment&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;real_time_progress&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here, the &lt;code&gt;chunk_size&lt;/code&gt; and &lt;code&gt;max_chunk_size&lt;/code&gt; indicate how big each chunk of parsed text will be. We're passing in our folder containing the audio files to the &lt;code&gt;parse_voice()&lt;/code&gt; function of the &lt;code&gt;Parser&lt;/code&gt; class.&lt;/p&gt;

&lt;p&gt;The function accepts many more optional arguments controlling how we'd like the audio to be parsed, but we can leave the rest at their defaults for this example.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 3: Text searching 🕵️
&lt;/h2&gt;

&lt;p&gt;Let's now run a text search on our parsed audio. We can try searching for the word "president". In other words, we want to find all the portions of the audio, and the corresponding text, that contain the word "president". We can do this using the &lt;code&gt;fast_search_dicts()&lt;/code&gt; function in the &lt;code&gt;Utilities&lt;/code&gt; class of the &lt;code&gt;llmware&lt;/code&gt; library.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Utilities&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;fast_search_dicts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;president&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;parser_output&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;




&lt;h2&gt;
  
  
  Step 4: Making an AI call on text chunks 🤖
&lt;/h2&gt;

&lt;p&gt;Now that we have a list of text blocks containing the word "president", let's use an AI model to identify which presidents are mentioned in the selected text blocks.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="n"&gt;extract_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ModelCatalog&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;load_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;slim-extract-tool&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here, we're using the &lt;code&gt;ModelCatalog&lt;/code&gt; class to load in our SLIM Extract Tool. Let's now iterate over each text block containing "president".&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="n"&gt;final_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;extract_model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;function_call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;president name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;We're making a &lt;code&gt;function_call()&lt;/code&gt; for "president name". &lt;strong&gt;This is how we ask our Tool to identify the president name in the text block.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 5: Analyzing our output 🔍
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;function_call()&lt;/code&gt; function returns a dictionary containing a lot of data about the call. We specifically want the &lt;code&gt;president_name&lt;/code&gt; key inside its &lt;code&gt;llm_response&lt;/code&gt; value.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="n"&gt;extracted_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;president_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm_response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm_response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;president_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;extracted_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm_response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;president_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;update: skipping result - no president name found - &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm_response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If the value of the &lt;code&gt;president_name&lt;/code&gt; key is a non-empty string, then we store its value in &lt;code&gt;extracted_name&lt;/code&gt;. Otherwise, no result was found and we print this out.&lt;/p&gt;

&lt;p&gt;Now let's see if the extracted name matches any of the American presidents in this list:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="n"&gt;various_american_presidents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;kennedy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;carter&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;nixon&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reagan&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;clinton&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;obama&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;To do this, we'll check if the &lt;code&gt;extracted_name&lt;/code&gt; contains any of these American presidents. If we have a match, then we'll add it to our &lt;code&gt;final_list&lt;/code&gt; as a dictionary containing some information about the location of the name in the audio as well as the text block it was in.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;president&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;various_american_presidents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;president&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;extracted_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;final_list&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;president&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;file_source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;time_start&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;coords_x&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]})&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;




&lt;h2&gt;
  
  
  Results! ✅
&lt;/h2&gt;

&lt;p&gt;Let's now output the &lt;code&gt;final_list&lt;/code&gt;.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;final_list&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;final results: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This is what one search result in the output looks like after running the code.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz3g0jpwu6s1vq2mvwtjc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz3g0jpwu6s1vq2mvwtjc.png" alt="Sample output"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here, we have a Python dictionary as output containing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;key&lt;/code&gt;: the name of the president identified, which here is "kennedy"&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;source&lt;/code&gt;: the audio file this was found in, which here is "ConcessionStand.wav"&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;time_start&lt;/code&gt;: the time stamp in seconds where the president was mentioned, which here is 339.9 seconds&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;text&lt;/code&gt;: the text chunk the name was found in.&lt;/li&gt;
&lt;/ul&gt;
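
&lt;p&gt;Because each entry records its source file and time stamp, post-processing is straightforward. As a quick sketch (assuming &lt;code&gt;final_list&lt;/code&gt; is populated as above), we could print a compact "where to listen" index:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# turn the structured results into a simple listening guide
for entry in final_list:
    minutes, seconds = divmod(int(entry["time_start"]), 60)
    print(f'{entry["key"]} mentioned in {entry["source"]} at {minutes}m {seconds}s')
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;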




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;And we're done! To recap, we were able to parse our audio files into text, run a text search on them for the word "president", and then use our SLIM Extract Tool to identify the specific presidents named in our text chunks! &lt;strong&gt;And remember that we did all this on just a CPU! 💻&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Be sure to check out our YouTube video on this example!&lt;br&gt;
&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/5y0ez5ZBpPE?start=804"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;If you made it this far, thank you for taking the time to go through this topic with us ❤️! For more content like this, make sure to &lt;a href="https://dev.to/llmware"&gt;visit our dev.to page&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The source code for many more examples like this one is on &lt;a href="https://github.com/llmware-ai/llmware" rel="noopener noreferrer"&gt;our GitHub&lt;/a&gt;. Find this example &lt;a href="https://github.com/llmware-ai/llmware/blob/main/examples/Use_Cases/parsing_great_speeches.py" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Our repository also contains a &lt;a href="https://github.com/llmware-ai/llmware/blob/main/examples/Notebooks/NoteBook_Examples/parsing_great_speeches_notebook.ipynb" rel="noopener noreferrer"&gt;notebook for this example&lt;/a&gt; that you can run yourself using Google Colab, Jupyter or any other platform that supports .ipynb notebooks.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://discord.gg/fCztJQeV7J" rel="noopener noreferrer"&gt;Join our Discord&lt;/a&gt; to interact with a growing community of AI enthusiasts of all levels of experience!&lt;/p&gt;

&lt;p&gt;Please be sure to visit our website &lt;a href="https://llmware.ai/" rel="noopener noreferrer"&gt;llmware.ai&lt;/a&gt; for more information and updates.&lt;/p&gt;

</description>
      <category>python</category>
      <category>ai</category>
      <category>beginners</category>
    </item>
    <item>
      <title>🤖AI-Powered Contract Queries: Use Language Models for Effective Analysis!🔥</title>
      <dc:creator>Prashant Iyer</dc:creator>
      <pubDate>Fri, 28 Jun 2024 20:48:31 +0000</pubDate>
      <link>https://forem.com/llmware/ai-powered-contract-queries-use-language-models-for-effective-analysis-461o</link>
      <guid>https://forem.com/llmware/ai-powered-contract-queries-use-language-models-for-effective-analysis-461o</guid>
      <description>&lt;p&gt;Imagine you were given a large contract and asked a really specific question about it: "What is the notice for termination for convenience?" It would be an ordeal to locate the answer for this in the contract.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;But what if we could use AI 🤖&lt;/strong&gt; to analyze the contract and answer this for us?&lt;/p&gt;

&lt;p&gt;What we want here is to perform something known as &lt;em&gt;retrieval-augmented generation&lt;/em&gt; (RAG). This is the process by which we give a &lt;em&gt;language model&lt;/em&gt; some external sources (such as a contract). The external sources are intended to enhance the model's context, giving it a more comprehensive understanding of a topic. The model should then give us more accurate responses to the questions we ask it on the topic.&lt;/p&gt;

&lt;p&gt;Now, a general-purpose model like ChatGPT might be able to answer questions about contracts with RAG, but &lt;strong&gt;what if we instead used a model that's been trained and fine-tuned&lt;/strong&gt; specifically on contract data?&lt;/p&gt;




&lt;h2&gt;
  
  
  Our AI model 🤖
&lt;/h2&gt;

&lt;p&gt;For this example, we'll be using LLMWare's &lt;em&gt;dragon-yi-6b-gguf&lt;/em&gt; model. This model is RAG-finetuned for fact-based question-answering on complex business and legal documents.&lt;/p&gt;

&lt;p&gt;This means that it is specialized in giving us short and concise responses to questions involving documents like contracts. This makes it perfect for our example!&lt;/p&gt;

&lt;p&gt;This is also a &lt;em&gt;GGUF quantized&lt;/em&gt; model, meaning that it is a smaller and faster version of the original 6 billion parameter &lt;em&gt;dragon-yi-6b&lt;/em&gt; model. Fortunately for us, this means that &lt;strong&gt;we can run it on a CPU 💻&lt;/strong&gt; without the need for powerful computational resources like GPUs!&lt;/p&gt;

&lt;p&gt;Now, let's look at an example of using the &lt;code&gt;llmware&lt;/code&gt; library for contract analysis from start to end!&lt;/p&gt;
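
&lt;p&gt;As in our other examples, the snippets below assume a few imports from &lt;code&gt;llmware&lt;/code&gt; (plus &lt;code&gt;os&lt;/code&gt;). The module paths are our assumptions based on the library's layout, so verify them against the full example linked at the end:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import os

# assumed module paths - verify against the full example in the llmware repo
from llmware.setup import Setup                     # sample contract files
from llmware.library import Library                 # document parsing and storage
from llmware.retrieval import Query                 # text search over the library
from llmware.prompts import Prompt, HumanInTheLoop  # model calls and CSV export
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;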




&lt;h2&gt;
  
  
  Step 1: Loading in files 📁
&lt;/h2&gt;

&lt;p&gt;Let's start off by loading in our contracts to be analyzed. The &lt;code&gt;llmware&lt;/code&gt; library provides sample contracts via the &lt;code&gt;Setup&lt;/code&gt; class, but you can also use your own files in this example by replacing the &lt;code&gt;agreements_path&lt;/code&gt; below.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="n"&gt;local_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Setup&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;load_sample_files&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;agreements_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;local_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AgreementsLarge&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here, we load in the &lt;code&gt;AgreementsLarge&lt;/code&gt; set of files.&lt;/p&gt;

&lt;p&gt;Next, we'll create a &lt;code&gt;Library&lt;/code&gt; object and add our sample files to this library. An &lt;code&gt;llmware&lt;/code&gt; library breaks documents down into text chunks and stores them in a database so that we can access them easily later.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="n"&gt;msa_lib&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Library&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;create_new_library&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;msa_lib503_635&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;msa_lib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_files&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agreements_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;




&lt;h2&gt;
  
  
  Step 2: Locating MSA files 🔍
&lt;/h2&gt;

&lt;p&gt;Let's say that we want to consider only MSA (master services agreement) files from our sample contracts.&lt;/p&gt;

&lt;p&gt;We can first create a &lt;code&gt;Query&lt;/code&gt; object containing all our files, and then run a &lt;code&gt;text_search_by_page()&lt;/code&gt; to filter only the files that contain "master services agreement" on their front page.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="n"&gt;q&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msa_lib&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;master services agreement&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text_search_by_page&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;page_num&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;results_only&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;msa_docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;file_source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;results&lt;/code&gt; from the text search will be a dictionary containing detailed information about the text query. However, we're only interested in the &lt;code&gt;file_source&lt;/code&gt; key representing the file names.&lt;/p&gt;

&lt;p&gt;Great! We now have our MSA files.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F38f6e5r3ki5xwrhxgve0.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F38f6e5r3ki5xwrhxgve0.gif" alt="Simpsons GIF"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 3: Loading our model 🪫🔋
&lt;/h2&gt;

&lt;p&gt;Now, we can load in our model using the &lt;code&gt;Prompt&lt;/code&gt; class in the &lt;code&gt;llmware&lt;/code&gt; library.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="n"&gt;model_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llmware/dragon-yi-6b-gguf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;prompter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Prompt&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;load_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;




&lt;h2&gt;
  
  
  Step 4: Analyzing our files using AI 🧠💡
&lt;/h2&gt;

&lt;p&gt;Let's now iterate over our MSA files, and for each file, we'll:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;identify the text chunks containing the word "termination",&lt;/li&gt;
&lt;li&gt;add those chunks as a source for our AI call, and&lt;/li&gt;
&lt;li&gt;run the AI call "What is the notice for termination for convenience?"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We can start by performing a text query for the word "termination".&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;docs&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msa_docs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;doc_filter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;file_source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;
    &lt;span class="n"&gt;termination_provisions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text_query_with_document_filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;termination&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;doc_filter&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;We'll then add these &lt;code&gt;termination_provisions&lt;/code&gt; as a source to our model.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="n"&gt;sources&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prompter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_source_query_results&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;termination_provisions&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;And with that done, we can call the LLM and ask it our question.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prompter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;prompt_with_source&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is the notice for termination for convenience?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;




&lt;h2&gt;
  
  
  Results! ✅
&lt;/h2&gt;

&lt;p&gt;Let's print out our &lt;code&gt;response&lt;/code&gt; and see what the output looks like.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;update: llm response - &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here's what the output of our code looks like:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj1z3rck11d71foci7rvu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj1z3rck11d71foci7rvu.png" alt="Sample output"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;What we have is a Python dictionary with several keys, notably:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;llm_response&lt;/code&gt;: giving us the answer to our question, which here is "30 days written notice"&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;evidence&lt;/code&gt;: giving us the text where the model found the answer to the question&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The dictionary also contains detailed information about the metadata of the AI call, but these are not relevant to our example and have been omitted from the output above.&lt;/p&gt;
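
&lt;p&gt;If you only care about the answer and its supporting text, you can pull just those two keys from each response dictionary. Here is a small sketch, assuming the &lt;code&gt;response&lt;/code&gt; list returned by &lt;code&gt;prompt_with_source()&lt;/code&gt; above:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# keep only the fields we care about from each response dictionary
for resp in response:
    print("answer: ", resp["llm_response"])
    print("evidence: ", resp["evidence"])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;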




&lt;h2&gt;
  
  
  Human in the loop! 👤
&lt;/h2&gt;

&lt;p&gt;We're not done just yet! If we want to generate a CSV report so a human can review the results of our analysis, we can make use of the &lt;code&gt;HumanInTheLoop&lt;/code&gt; class. All we need to do is save the current state of our &lt;code&gt;prompter&lt;/code&gt; and call the &lt;code&gt;export_current_interaction_to_csv()&lt;/code&gt; function.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;

&lt;span class="n"&gt;prompter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save_state&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;csv_output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;HumanInTheLoop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompter&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;export_current_interaction_to_csv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;And that brings us to the end of our example! To summarize, we used the &lt;code&gt;llmware&lt;/code&gt; library to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Load in sample files&lt;/li&gt;
&lt;li&gt;Filter only the MSA files&lt;/li&gt;
&lt;li&gt;Use the dragon-yi-6b-gguf model to ask questions about termination provisions.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;And remember that we did all of this on just a CPU! 💻&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Check out our YouTube video on this example!&lt;br&gt;
&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/Cf-07GBZT68"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;If you made it this far, thank you for taking the time to go through this topic with us ❤️! For more content like this, make sure to &lt;a href="https://dev.to/llmware"&gt;visit our dev.to page&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The source code for many more examples like this one is on &lt;a href="https://github.com/llmware-ai/llmware" rel="noopener noreferrer"&gt;our GitHub&lt;/a&gt;. Find this example &lt;a href="https://github.com/llmware-ai/llmware/blob/main/examples/Use_Cases/msa_processing.py" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Our repository also contains a &lt;a href="https://github.com/llmware-ai/llmware/blob/main/examples/Notebooks/NoteBook_Examples/msa_processing_notebook.ipynb" rel="noopener noreferrer"&gt;notebook for this example&lt;/a&gt; that you can run yourself using Google Colab, Jupyter or any other platform that supports .ipynb notebooks.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://discord.gg/fCztJQeV7J" rel="noopener noreferrer"&gt;Join our Discord&lt;/a&gt; to interact with a growing community of AI enthusiasts of all levels of experience!&lt;/p&gt;

&lt;p&gt;Please be sure to visit our website &lt;a href="https://llmware.ai/" rel="noopener noreferrer"&gt;llmware.ai&lt;/a&gt; for more information and updates.&lt;/p&gt;

</description>
      <category>python</category>
      <category>ai</category>
      <category>beginners</category>
    </item>
    <item>
      <title>The Hardest Problem in RAG... Handling 'NOT FOUND' Answers 🔍🤔</title>
      <dc:creator>Will Taner</dc:creator>
      <pubDate>Mon, 24 Jun 2024 17:04:03 +0000</pubDate>
      <link>https://forem.com/llmware/the-hardest-problem-in-rag-handling-not-found-answers-7md</link>
      <guid>https://forem.com/llmware/the-hardest-problem-in-rag-handling-not-found-answers-7md</guid>
      <description>&lt;h2&gt;
  
  
  First of All... What is RAG? 🕵️‍♂️
&lt;/h2&gt;

&lt;p&gt;Retrieval-Augmented Generation (RAG) is an approach to natural language processing that references external documents to provide more accurate and contextually relevant answers. Despite its advantages, RAG faces some challenges, one of which is handling 'NOT FOUND' answers. Addressing this issue is crucial for developing an effective and reliable model that everyone can use.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why 'NOT FOUND' Answers Can Be Concerning ⛔️
&lt;/h2&gt;

&lt;p&gt;Some models respond with "hallucinations" when they cannot find an answer, creating inaccurate responses that may mislead the user. This can undermine the trust users have in the model, making it less reliable and effective.&lt;/p&gt;




&lt;h2&gt;
  
  
  How Can We Remedy This? 🛠️
&lt;/h2&gt;

&lt;p&gt;For starters, it is better for the model to inform the user that it could not find the answer rather than fabricating one.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i.giphy.com/media/wocf9LwylhReij2MZE/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/wocf9LwylhReij2MZE/giphy.gif" alt="It's Okay" width="480" height="269"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, we will delve into one way LLMWare handles 'NOT FOUND' cases effectively. By examining these methods, we can gain a better understanding of how to address this issue and enhance the overall performance and reliability of RAG systems.&lt;/p&gt;




&lt;h2&gt;
  
  
  For the Visual Learners... 📺
&lt;/h2&gt;

&lt;p&gt;Here is a video discussing the same topic as this article. A good approach is to watch the video first, then work through the steps below.&lt;br&gt;
&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/slDeF7bYuv0"&gt;
&lt;/iframe&gt;
&lt;/p&gt;




&lt;h2&gt;
  
  
  Framework 🖼️
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;LLMWare&lt;/strong&gt;&lt;br&gt;
For our new readers, LLMWare is a comprehensive, open-source framework that provides a unified platform for application patterns based on LLMs, including Retrieval-Augmented Generation (RAG).&lt;/p&gt;

&lt;p&gt;Please run &lt;code&gt;pip3 install llmware&lt;/code&gt; in the command line to download the package.&lt;/p&gt;


&lt;h2&gt;
  
  
  Import Libraries and Create Context 📚
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llmware.models&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ModelCatalog&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llmware.parsers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;WikiParser&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;ModelCatalog&lt;/strong&gt;: A class within &lt;code&gt;llmware&lt;/code&gt; that manages selecting, loading, and configuring models.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;WikiParser&lt;/strong&gt;: A class within &lt;code&gt;llmware&lt;/code&gt; that handles the retrieval and packaging of content from Wikipedia.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;BEAVERTON, Ore.--(BUSINESS WIRE)--NIKE, Inc. (NYSE:NKE) today reported fiscal 2024 financial results for its &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;third quarter ended February 29, 2024.) “We are making the necessary adjustments to drive NIKE’s next chapter &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;of growth Post this Third quarter revenues were slightly up on both a reported and currency-neutral basis* &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;at $12.4 billion NIKE Direct revenues were $5.4 billion, slightly up on a reported and currency-neutral basis &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;NIKE Brand Digital sales decreased 3 percent on a reported basis and 4 percent on a currency-neutral basis &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Wholesale revenues were $6.6 billion, up 3 percent on a reported and currency-neutral basis Gross margin &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;increased 150 basis points to 44.8 percent, including a detriment of 50 basis points due to restructuring charges &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Selling and administrative expense increased 7 percent to $4.2 billion, including $340 million of restructuring &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;charges Diluted earnings per share was $0.77, including $0.21 of restructuring charges. Excluding these &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;charges, Diluted earnings per share would have been $0.98* “We are making the necessary adjustments to &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;drive NIKE’s next chapter of growth,” said John Donahoe, President &amp;amp; CEO, NIKE, Inc. “We’re encouraged by &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;the progress we’ve seen, as we build a multiyear cycle of new innovation, sharpen our brand storytelling and &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;work with our wholesale partners to elevate and grow the marketplace.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here is the initial text for our extraction. It provides details about the popular sports brand, Nike. Feel free to modify this text to suit your needs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i.giphy.com/media/KEYEpIngcmXlHetDqz/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/KEYEpIngcmXlHetDqz/giphy.gif" alt="It's All Coming Together" width="480" height="284"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Create Key for Extraction 🔐
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;extract_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;company founding date&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;dict_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;extract_key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;_&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;company_founding_date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, we set the company founding date as the target key to extract from the text (after replacing spaces with underscores, the dictionary key becomes &lt;code&gt;company_founding_date&lt;/code&gt;). Note that the press release above never states Nike's founding date, so this first extraction is expected to come back empty: exactly the 'NOT FOUND' case this example is designed to handle.&lt;/p&gt;




&lt;h2&gt;
  
  
  Run Initial Extract 🏃
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ModelCatalog&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;load_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;slim-extract-tool&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;function_call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;function&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;extract&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;extract_key&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;llm_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm_response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Model&lt;/strong&gt;: In this snippet, we load LLMWare's slim-extract-tool, a 2.8B-parameter model fine-tuned for general-purpose extraction and packaged in GGUF format (GGUF is a file format for quantized models; quantization reduces model size and speeds up inference at a small cost in accuracy).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Temperature&lt;/strong&gt;: This controls the randomness of the output. Typical values range between 0 and 1: lower values make the model more deterministic, while higher values make it more random and creative.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sample&lt;/strong&gt;: Determines whether the next token is sampled from the probability distribution (&lt;code&gt;True&lt;/code&gt;) or always chosen greedily as the single most likely token (&lt;code&gt;False&lt;/code&gt;), which makes the output deterministic.&lt;/p&gt;

&lt;p&gt;We then attempt to extract the information from the text using the model and store it in &lt;code&gt;llm_response&lt;/code&gt;.&lt;/p&gt;
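
&lt;p&gt;Before looking at the branching logic, it helps to know the shape of a slim-extract response: each extract key maps to a list of values, and that list comes back empty when nothing is found. Here is a minimal sketch with hypothetical values:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def has_answer(llm_response, dict_key):
    # True only if the extract key is present AND its value list is non-empty
    return dict_key in llm_response and len(llm_response[dict_key]) &gt; 0

# hypothetical response shapes for illustration
print(has_answer({"company_founding_date": ["1964"]}, "company_founding_date"))  # True
print(has_answer({"company_founding_date": []}, "company_founding_date"))        # False
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;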




&lt;h2&gt;
  
  
  If Answer is Found... ✅
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;dict_key&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;llm_response&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

        &lt;span class="n"&gt;company_founding_date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;llm_response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;dict_key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;company_founding_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

            &lt;span class="n"&gt;company_founding_date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;company_founding_date&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;update: found the &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;extract_key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; value - &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;company_founding_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;company_founding_date&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the model successfully finds and extracts the company founding date, we will return the information.&lt;/p&gt;




&lt;h2&gt;
  
  
  If Answer is Not Found... ❌
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;update: did not find the target value in the text - &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;company_founding_date&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;update: initiating a secondary process to try to find the information&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;function_call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;function&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;extract&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;company name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the model does not find the company founding date, we run a second query to extract the company name, which we will then use to look up more information in a secondary source.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i.giphy.com/media/3ov9kbWOowpMeMTu6c/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/3ov9kbWOowpMeMTu6c/giphy.gif" alt="Emmy Awards" width="480" height="266"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Retrieve Information from Wiki 📖
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;company_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm_response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;company_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm_response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;company_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;company_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;update: found the company name - &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;company_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; - now using to lookup in secondary source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;WikiParser&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;add_wiki_topic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;company_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;target_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After extracting the company's name from the text, we retrieve additional information about the company from Wikipedia.&lt;/p&gt;
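
&lt;p&gt;For reference, &lt;code&gt;add_wiki_topic&lt;/code&gt; returns a dictionary whose &lt;code&gt;articles&lt;/code&gt; list holds the retrieved entries, each with a &lt;code&gt;summary&lt;/code&gt; field - the two fields this example relies on. A standalone sketch (the topic string is just an illustration):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from llmware.parsers import WikiParser

# pull a single article for a topic and inspect the fields used in this example
output = WikiParser().add_wiki_topic("Nike, Inc.", target_results=1)

first_article = output["articles"][0]
print("update: article summary (first 150 chars) - ", first_article["summary"][:150])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;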




&lt;h2&gt;
  
  
  Generate a Summary Snippet from Retrieved Article Data ✍️
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

    &lt;span class="n"&gt;supplemental_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;articles&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;summary&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;supplemental_text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;150&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;supplemental_text_pp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;supplemental_text&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;150&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; ... &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;supplemental_text_pp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;supplemental_text&lt;/span&gt;

    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;update: using lookup - &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;company_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; - found secondary source article &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                              &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;(extract displayed) - &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;supplemental_text_pp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If we have successfully retrieved additional data from Wikipedia, we set &lt;code&gt;supplemental_text_pp&lt;/code&gt; to a display-friendly version of the article summary: truncated to 150 characters with a trailing ellipsis if it is longer than that, or the full text otherwise. Note that the model still receives the full &lt;code&gt;supplemental_text&lt;/code&gt; in the next step; the truncation is only for printing.&lt;/p&gt;




&lt;h2&gt;
  
  
  Call Extract Again With New Information 📞
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;new_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;function_call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;supplemental_text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;company founding date&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;update: reviewed second source article - &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;new_response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm_response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Using the new information retrieved from Wikipedia, we run the same extraction with the model again.&lt;/p&gt;




&lt;h2&gt;
  
  
  Print Response If Found 🖨️
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;company_founding_date&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;new_response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm_response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;company_founding_date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;new_response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm_response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;company_founding_date&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;company_founding_date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;update: success - found the answer - &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;company_founding_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If we find the company founding date after incorporating the new information, we print the result.&lt;/p&gt;




&lt;h2&gt;
  
  
  Fully Integrated Code 🧑‍💻
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llmware.models&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ModelCatalog&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llmware.parsers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;WikiParser&lt;/span&gt;

&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;BEAVERTON, Ore.--(BUSINESS WIRE)--NIKE, Inc. (NYSE:NKE) today reported fiscal 2024 financial results for its &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;third quarter ended February 29, 2024.) “We are making the necessary adjustments to drive NIKE’s next chapter &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;of growth Post this Third quarter revenues were slightly up on both a reported and currency-neutral basis* &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;at $12.4 billion NIKE Direct revenues were $5.4 billion, slightly up on a reported and currency-neutral basis &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;NIKE Brand Digital sales decreased 3 percent on a reported basis and 4 percent on a currency-neutral basis &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Wholesale revenues were $6.6 billion, up 3 percent on a reported and currency-neutral basis Gross margin &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;increased 150 basis points to 44.8 percent, including a detriment of 50 basis points due to restructuring charges &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Selling and administrative expense increased 7 percent to $4.2 billion, including $340 million of restructuring &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;charges Diluted earnings per share was $0.77, including $0.21 of restructuring charges. Excluding these &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;charges, Diluted earnings per share would have been $0.98* “We are making the necessary adjustments to &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;drive NIKE’s next chapter of growth,” said John Donahoe, President &amp;amp; CEO, NIKE, Inc. “We’re encouraged by &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;the progress we’ve seen, as we build a multiyear cycle of new innovation, sharpen our brand storytelling and &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;work with our wholesale partners to elevate and grow the marketplace.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;not_found_then_triage_lookup&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;

    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;Not Found Example - if info not found, then lookup in another source.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;extract_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;company founding date&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;dict_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;extract_key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;_&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;company_founding_date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;

    &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ModelCatalog&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;load_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;slim-extract-tool&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;function_call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;function&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;extract&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;extract_key&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;llm_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm_response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;update: first text reviewed for &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;extract_key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; - llm response: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;llm_response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;dict_key&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;llm_response&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

        &lt;span class="n"&gt;company_founding_date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;llm_response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;dict_key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;company_founding_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

            &lt;span class="n"&gt;company_founding_date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;company_founding_date&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;update: found the &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;extract_key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; value - &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;company_founding_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;company_founding_date&lt;/span&gt;

        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;update: did not find the target value in the text - &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;company_founding_date&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;update: initiating a secondary process to try to find the information&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;function_call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;function&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;extract&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;company name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;company_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm_response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
                &lt;span class="n"&gt;company_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm_response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;company_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;company_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;update: found the company name - &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;company_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; - now using to lookup in secondary source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

                    &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;WikiParser&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;add_wiki_topic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;company_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;target_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

                        &lt;span class="n"&gt;supplemental_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;articles&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;summary&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

                        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;supplemental_text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;150&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                            &lt;span class="n"&gt;supplemental_text_pp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;supplemental_text&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;150&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; ... &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                            &lt;span class="n"&gt;supplemental_text_pp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;supplemental_text&lt;/span&gt;

                        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;update: using lookup - &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;company_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; - found secondary source article &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                              &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;(extract displayed) - &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;supplemental_text_pp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

                        &lt;span class="n"&gt;new_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;function_call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;supplemental_text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;company founding date&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

                        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;update: reviewed second source article - &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;new_response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm_response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

                        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;company_founding_date&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;new_response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm_response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
                            &lt;span class="n"&gt;company_founding_date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;new_response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm_response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;company_founding_date&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;company_founding_date&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                                &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;update: success - found the answer - &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;company_founding_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;company_founding_date&lt;/span&gt;


&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

    &lt;span class="n"&gt;founding_date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;not_found_then_triage_lookup&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You may also find the fully integrated code on our GitHub &lt;a href="https://github.com/llmware-ai/llmware/blob/main/examples/SLIM-Agents/not_found_extract_with_lookup.py" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Additionally, the notebook version (ipynb) is available &lt;a href="https://github.com/llmware-ai/llmware/blob/main/examples/Notebooks/NoteBook_Examples/not_found_extract_with_lookup.ipynb" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion 🤖
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://i.giphy.com/media/rSVRXeKPgeM5xfGyCR/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/rSVRXeKPgeM5xfGyCR/giphy.gif" alt="Men In Kilts" width="600" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Handling 'NOT FOUND' answers is one of the hardest problems in RAG, but it's a challenge that can be mitigated with thoughtful design. By implementing techniques like broader lookups, LLMWare aims to enhance the overall user experience and reliability of its AI systems.&lt;/p&gt;

&lt;p&gt;Please check out our GitHub and leave a star! &lt;a href="https://github.com/llmware-ai/llmware" rel="noopener noreferrer"&gt;https://github.com/llmware-ai/llmware&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Follow us on Discord here: &lt;a href="https://discord.gg/MgRaZz2VAB" rel="noopener noreferrer"&gt;https://discord.gg/MgRaZz2VAB&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Please be sure to visit our website &lt;a href="//llmware.ai"&gt;llmware.ai&lt;/a&gt; for more information and updates.&lt;/p&gt;

</description>
      <category>rag</category>
      <category>ai</category>
      <category>python</category>
      <category>programming</category>
    </item>
    <item>
      <title>Are we all prompting wrong? Balancing Creativity and Consistency in RAG.</title>
      <dc:creator>Simon Risman</dc:creator>
      <pubDate>Mon, 17 Jun 2024 18:44:07 +0000</pubDate>
      <link>https://forem.com/llmware/are-we-all-prompting-wrong-balancing-creativity-and-consistency-in-rag-20fm</link>
      <guid>https://forem.com/llmware/are-we-all-prompting-wrong-balancing-creativity-and-consistency-in-rag-20fm</guid>
      <description>&lt;p&gt;For a Boston native like myself, there are few things more heartwarming than Artificial Intelligence understanding the brilliance of &lt;em&gt;Good Will Hunting&lt;/em&gt;. A few cursory prompts reveal that it views it as a "must-watch tale of redemption and self discovery". &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Figxw6cfugs9mb2ivue1i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Figxw6cfugs9mb2ivue1i.png" alt="Chat Will Hunting" width="800" height="683"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;But a slightly closer look reveals what many users of LLMs have accepted as a given - slight variations on an otherwise consistent topic. This is the result of Stochastic Generation. &lt;/p&gt;

&lt;h2&gt;
  
  
  Stochastic generation 🤖
&lt;/h2&gt;

&lt;p&gt;This is a fairly common term: from online bootcamps to college lectures, students of AI are familiar with the concept. For those who need a quick refresher, here is the 3-step generation loop that many LLMs follow.&lt;/p&gt;

&lt;p&gt;LLMs are trained using a next-token prediction task, where the model predicts the next token in a sequence based on the previous tokens. This process involves:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Tokenized Input&lt;/strong&gt;: The input text is converted into a sequence of numbers (tokens).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Probability Distribution&lt;/strong&gt;: The model generates a probability distribution over the possible next tokens.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sampling Algorithm&lt;/strong&gt;: This distribution is passed through a sampling algorithm to select the next token.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The probabilistic elements that this process introduces enable LLMs to generate more captivating dialogue, produce novel images, and creatively praise award-winning films. &lt;br&gt;
&lt;a href="https://i.giphy.com/media/v1.Y2lkPTc5MGI3NjExbHpta3l3aXZicWp4dDhjbjc1b3p5MjVmMnZvcGQ2d3FqMzNnZDF2ZyZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw/7pLv68ItwBaHS/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/v1.Y2lkPTc5MGI3NjExbHpta3l3aXZicWp4dDhjbjc1b3p5MjVmMnZvcGQ2d3FqMzNnZDF2ZyZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw/7pLv68ItwBaHS/giphy.gif" width="480" height="270"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Randomness and RAG 🎰
&lt;/h2&gt;

&lt;p&gt;When building RAG-based applications, we are often not as concerned with creativity as we are with facts, and when dealing with facts, we want as little randomness involved as possible. In other words, instead of sampling from a probability distribution, it's beneficial to simply take the maximum-likelihood token every time. &lt;/p&gt;

&lt;p&gt;LLMWare allows you to measure how random your generated results are, as well as adjust how random you want them to be. Here's a quick demonstration:&lt;/p&gt;
&lt;h2&gt;
  
  
  Demo 🙌
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Load the model&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ModelCatalog&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;load_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bling-stablelm-3b-tool&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                                  &lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                                  &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                                  &lt;span class="n"&gt;get_logits&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                                  &lt;span class="n"&gt;max_output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;123&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the load_model method, we make a few important selections. The BLING StableLM 3B model (&lt;code&gt;bling-stablelm-3b-tool&lt;/code&gt;) is one of our newest and highest-performing models. &lt;/p&gt;

&lt;p&gt;Setting the sample attribute to &lt;code&gt;True&lt;/code&gt; or &lt;code&gt;False&lt;/code&gt; lets you switch between stochastic sampling and greedy (top-token) decoding. &lt;/p&gt;

&lt;p&gt;The temperature can be an important tool to control the randomness of the output, with lower values making responses more focused and higher values increasing diversity in the generated text.&lt;/p&gt;

&lt;p&gt;These key settings will allow you to see what kind of approach you want to take when it comes to the probabilistic nature of your model. &lt;/p&gt;
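
&lt;p&gt;For example, a quick way to see the difference is to load the model twice - once with sampling and once without - and repeat the same question. A sketch, where &lt;code&gt;test_passage&lt;/code&gt; is a stand-in for whatever source text you are analyzing:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from llmware.models import ModelCatalog

# a stand-in passage - substitute your own source text here
test_passage = ("NIKE, Inc. reported third-quarter revenues of $12.4 billion, "
                "with gross margin up 150 basis points to 44.8 percent.")

sampled = ModelCatalog().load_model("bling-stablelm-3b-tool", sample=True, temperature=0.7)
greedy = ModelCatalog().load_model("bling-stablelm-3b-tool", sample=False, temperature=0.0)

# the two sampled runs may differ from each other; the greedy runs should not
for m in (sampled, sampled, greedy, greedy):
    response = m.inference("What is a list of the key points?", add_context=test_passage)
    print(response["llm_response"])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;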

&lt;p&gt;&lt;strong&gt;Run a simple inference model on some sample text&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;inference&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is a list of the key points?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sample&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This step is where your model does the heavy lifting, analyzing and summarizing the passage you load in.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Run a sampling analysis&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;sampling_analysis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ModelCatalog&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;analyze_sampling&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sampling analysis: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sampling_analysis&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now you get to see the analytics - giving you a better idea of how heavily your model samples from the lower-probability side of the distribution.&lt;/p&gt;

&lt;p&gt;This analysis includes the percentage of selected tokens that were also the highest-probability output, and notes the cases where a lower-ranked token was chosen instead. &lt;/p&gt;

&lt;p&gt;In cases where the top token was not selected, the code below prints the exact entries, including each token's rank.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;entries&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sampling_analysis&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;not_top_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]):&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sampled choices: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All these tools can help you make an informed decision about whether you want your model to think a little outside the box or stick to the most likely answer. To see this process in action, check out our YouTube video on consistent LLM output generation.&lt;/p&gt;

&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/iXp1tj-pPjM"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;The full code for this example can be found in our &lt;a href="https://github.com/llmware-ai/llmware/blob/main/examples/Models/adjusting_sampling_settings.py" rel="noopener noreferrer"&gt;Github repo&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you have any questions, or would like to learn more about LLMWARE, come to our Discord community. Click &lt;a href="https://discord.gg/6nNVdn7A" rel="noopener noreferrer"&gt;here&lt;/a&gt; to join. See you there!🚀🚀🚀&lt;/p&gt;

&lt;p&gt;Please be sure to visit our website &lt;a href="https://llmware.ai/" rel="noopener noreferrer"&gt;llmware.ai&lt;/a&gt; for more information and updates.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>python</category>
      <category>rag</category>
    </item>
  </channel>
</rss>
