<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Phil Nash</title>
    <description>The latest articles on Forem by Phil Nash (@philnash).</description>
    <link>https://forem.com/philnash</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1359%2F3daf053e-eade-4edb-a423-1c3c3a318ef9.jpg</url>
      <title>Forem: Phil Nash</title>
      <link>https://forem.com/philnash</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/philnash"/>
    <language>en</language>
    <item>
      <title>5 quick tips for giving better presentations</title>
      <dc:creator>Phil Nash</dc:creator>
      <pubDate>Thu, 05 Mar 2026 00:00:00 +0000</pubDate>
      <link>https://forem.com/philnash/5-quick-tips-for-giving-better-presentations-2j0f</link>
      <guid>https://forem.com/philnash/5-quick-tips-for-giving-better-presentations-2j0f</guid>
      <description>&lt;p&gt;I have been &lt;a href="https://philna.sh/speaking" rel="noopener noreferrer"&gt;speaking publicly&lt;/a&gt; at developer conferences for over a decade and in that time I've seen plenty of other people giving talks. Everyone gives talks differently, and if you are a speaker or thinking about getting into it, I encourage you to work on developing your unique style as well as your content.&lt;/p&gt;

&lt;p&gt;However, recently I have noticed a few little things that some speakers do and others do not that make for a better experience for both the audience and the presenter. They are mostly inconsequential for the content of a talk, but they can act as speed bumps that take the momentum out of a talk.&lt;/p&gt;

&lt;p&gt;So, in no particular order, here are a few things you should remember when giving a talk.&lt;/p&gt;

&lt;h2&gt;
  
  
  Finish strong
&lt;/h2&gt;

&lt;p&gt;Have you ever seen this situation play out?&lt;/p&gt;

&lt;p&gt;A speaker wraps up their talk nicely with an insightful conclusion. Then they say something like, "Thank you very much. Any questions?"&lt;/p&gt;

&lt;p&gt;The audience, prepared to bring the house down with cheering and clapping are now caught like a deer in headlights. Instead of rapturous applause an awkward silence descends. Someone puts their hand up and asks a question. Q&amp;amp;A ensues, but everything now feels a bit flat.&lt;/p&gt;

&lt;p&gt;As an audience member, you want to show appreciation for a great job, so as a speaker you should give space for that. When concluding a talk, end with a "Thank you" or some other definitive way to indicate you are finished speaking, and let the crowd have their moment to thank you. Once the applause dies down, that's when you, or even better an MC or moderator, can indicate that it is time (or not) for questions.&lt;/p&gt;

&lt;p&gt;Don't let a talk fizzle out into the Q&amp;amp;A portion, finish on a high. Deal with the questions afterwards.&lt;/p&gt;

&lt;h2&gt;
  
  
  Take your lanyard off
&lt;/h2&gt;

&lt;p&gt;The conference lanyard is the sign that you are part of this temporary tribe that has formed. When wearing it, you can speak to anyone else also adorned in the conference colours knowing you have some common ground.&lt;/p&gt;

&lt;p&gt;When you are on stage to deliver a talk, you no longer need the lanyard to signify you are part of the experience. But that's not why you should take it off. You should remove it for the duration of your talk simply because it gets in the way.&lt;/p&gt;

&lt;p&gt;On stage there is more to consider. A lanyard hanging around your neck may get in the way of your laptop if you need to use it, it might bother the microphone cable, it might just not look that great. You probably didn't choose your outfit based on the conference colour scheme, so why throw a dangling colour clash into the mix?&lt;/p&gt;

&lt;p&gt;Take your lanyard off for the talk, but don't forget to put it back on afterwards. You will want to help people remember who you are once you're no longer in the spotlight.&lt;/p&gt;

&lt;h2&gt;
  
  
  Keyboard shortcuts you wish you'd known for years
&lt;/h2&gt;

&lt;p&gt;There are two macOS keyboard shortcuts that I think are vital to know if you want to make smooth transitions between parts of your talk.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cmd + F1
&lt;/h3&gt;

&lt;p&gt;Let's say you're presenting in Keynote and you're using the presenter view; you can see the notes on your laptop, the audience sees the slides on the screen. Your laptop is using the external screen as an extended display. Now you want to demo something. To do this you need to switch to mirroring, so that you can see the same demo on your laptop as the audience is seeing on the screen.&lt;/p&gt;

&lt;p&gt;You could make this switch by opening the display settings and changing the setting. However, the keyboard shortcut is &lt;code&gt;Cmd + F1&lt;/code&gt;. Flipping between mirroring and extending with a keypress is quicker, smoother and keeps the energy in your presentation. &lt;code&gt;Cmd + F1&lt;/code&gt;, commit it to memory.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cmd + Shift + F
&lt;/h3&gt;

&lt;p&gt;If you're using Chrome and you want to go into fullscreen (with the green button in the top left of the window, or &lt;code&gt;Shift + Fn + F&lt;/code&gt;) and you find that the address bar is still showing, taking up valuable screen real estate, that can be fixed. &lt;code&gt;Cmd + Shift + F&lt;/code&gt; toggles whether the address bar shows. You're probably reading this in Chrome right now, try it out!&lt;/p&gt;

&lt;h2&gt;
  
  
  Resize the font, don't zoom the window
&lt;/h2&gt;

&lt;p&gt;If you plan to live code or just show code within an IDE as part of a demo, you should make sure that the font is readable by the audience. I recommend doing this as part of a tech check before you are supposed to go on stage. This gives you the time to check things yourself and resize the text correctly.&lt;/p&gt;

&lt;p&gt;You might think that zooming in, with the shortcut &lt;code&gt;Cmd and +&lt;/code&gt; will help the audience see the code. In VS Code and other similar editors, that zooms the text and the entire interface. By the time the text is of a size that the audience will be able to read it, it will be crowded out by the file explorer and the terminal.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg8smj9m3fxilyxxnyvs9.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg8smj9m3fxilyxxnyvs9.gif" alt="Zooming in and out of VS Code just makes it harder to see the text." width="480" height="300"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Instead, take the time to open the settings (&lt;code&gt;Cmd + ,&lt;/code&gt;) and change the font size to something readable. That way the rest of the interface stays out of the way. If you plan to show things in the built-in terminal, make sure to increase the terminal font size too, it's a different setting.&lt;/p&gt;

&lt;p&gt;Walk to the back of the room, or get a friend to check and give you a thumbs up, and make sure things are readable before you start the talk.&lt;/p&gt;

&lt;h2&gt;
  
  
  Trust in the tech
&lt;/h2&gt;

&lt;p&gt;If you are in the fortunate position to be speaking at an event with a crew looking after you on stage, they will likely give you a microphone before you go on and be in control of it. Most of the time this means that the microphone is on and you needn't touch it, but it is muted at the sound desk. Then, when it is time for you to start they will unmute you and everyone will hear you.&lt;/p&gt;

&lt;p&gt;It can be unnerving to believe that once you start speaking everyone will be able to hear you, but you should. Much like my first tip to finish strong, you also want to start strong, and "Hello, can you hear me? Is this on?" is not the way to achieve that.&lt;/p&gt;

&lt;p&gt;Instead start by introducing yourself, start with a joke, start by thanking the audience for showing up. However you want to start, assume you will be heard and confidently start speaking. If something does go wrong, it's not your fault (unless you turned your microphone off on purpose). When things go right, you will capture the audience's attention and kick your session off in style.&lt;/p&gt;

&lt;h2&gt;
  
  
  Small tips, big impact
&lt;/h2&gt;

&lt;p&gt;I have, of course, done the opposite of all of these things myself. I've started by asking if I can be heard, zoomed my editor until only the side panel was visible, fiddled in settings and menus to make my screen show the right thing, flapped my lanyard around on stage, and finished with "Thank you, any questions?" and a silent room. I only hope by sharing some of these tips that you can avoid all of those things yourself, sail through your talk with all your energy being used to wow the audience, and feel great at the end of it all.&lt;/p&gt;

&lt;p&gt;So remember your shortcuts, take that lanyard off, trust that you will be heard, get your font sizes right, start with confidence and finish strong. I can't wait to see what you are going to talk about.&lt;/p&gt;

</description>
      <category>speaking</category>
      <category>devrel</category>
      <category>conferences</category>
    </item>
    <item>
      <title>Things you need to do for npm trusted publishing to work</title>
      <dc:creator>Phil Nash</dc:creator>
      <pubDate>Sat, 31 Jan 2026 00:00:00 +0000</pubDate>
      <link>https://forem.com/philnash/things-you-need-to-do-for-npm-trusted-publishing-to-work-46i2</link>
      <guid>https://forem.com/philnash/things-you-need-to-do-for-npm-trusted-publishing-to-work-46i2</guid>
      <description>&lt;p&gt;After the recent supply chain attacks on the npm ecosystem, notaby the &lt;a href="https://securitylabs.datadoghq.com/articles/shai-hulud-2.0-npm-worm/" rel="noopener noreferrer"&gt;Shai-Hulud 2.0 worm&lt;/a&gt;, &lt;a href="https://github.blog/security/supply-chain-security/our-plan-for-a-more-secure-npm-supply-chain/" rel="noopener noreferrer"&gt;GitHub took a number of actions&lt;/a&gt; to shore up the security of publishing packages to hopefully avoid further attacks. One of the outcomes was that long-lived npm tokens were revoked in favour of short-lived tokens or using trusted publishing.&lt;/p&gt;

&lt;p&gt;I have GitHub Actions set up to publish new versions of npm packages that I maintain when a new tag is pushed to the repository. This workflow used long-lived npm tokens to authenticate, so when it came to updating a package recently I needed to update the publishing method too. The &lt;a href="https://docs.npmjs.com/trusted-publishers" rel="noopener noreferrer"&gt;npm documentation on trusted publishing for npm packages&lt;/a&gt; was useful to a point, but there were some things I needed to do that the docs either didn't cover explicitly or weren't obvious enough, to get my package published successfully. I also came across &lt;a href="https://github.com/npm/cli/issues/8730" rel="noopener noreferrer"&gt;this thread on GitHub where other people had similar issues&lt;/a&gt;. I wanted to share those things here.&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;Briefly, the changes that worked for me were to add the following to my GitHub Action publishing workflow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Permission to generate an OIDC token&lt;/span&gt;
&lt;span class="na"&gt;permissions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;id-token&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;write&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;publish&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="s"&gt;...&lt;/span&gt;
      &lt;span class="s"&gt;# Ensure the latest npm is installed&lt;/span&gt;
      &lt;span class="s"&gt;- run&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm install -g npm@latest&lt;/span&gt;
      &lt;span class="s"&gt;...&lt;/span&gt;
      &lt;span class="s"&gt;# Add the --provenance flag to the publish command&lt;/span&gt;
      &lt;span class="s"&gt;- run&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm publish --provenance&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And ensure that the &lt;em&gt;package.json&lt;/em&gt; refers to the correct repository:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"repository"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"git"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"git+https://github.com/${username}/${packageName}.git"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For a bit more detail and alternative ways to set some of these settings, read on.&lt;/p&gt;

&lt;h2&gt;
  
  
  Package settings
&lt;/h2&gt;

&lt;p&gt;Ok, so this is embarrassing, but initially I couldn't find the settings I needed to enable trusted publishing. The npm docs say:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Navigate to your package settings on npmjs.com and find the "Trusted Publisher" section.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I spent far too long looking around the &lt;code&gt;https://www.npmjs.com/settings/${username}/packages&lt;/code&gt; page for the "Trusted Publisher" section. What I needed was the specific package settings, available here: &lt;code&gt;https://www.npmjs.com/package/${packageName}/access&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;You need to set up trusted publishing for each of your packages individually. That might be fine if you only maintain a few, it's going to be a huge hassle if you have a lot.&lt;/p&gt;

&lt;p&gt;Once you have filled in the trusted publisher settings, then its on to updating your project so that it can be published successfully.&lt;/p&gt;

&lt;h2&gt;
  
  
  Permissions
&lt;/h2&gt;

&lt;p&gt;This is in the npm docs, so I'm just including it for completeness. You need to &lt;a href="https://docs.npmjs.com/trusted-publishers#github-actions-configuration" rel="noopener noreferrer"&gt;give the workflow permission to generate an OIDC token&lt;/a&gt; that it can then use to publish the package. To do this requires one permission being set in your workflow file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;permissions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;id-token&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;write&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  npm version
&lt;/h2&gt;

&lt;p&gt;The docs clearly call out that:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Note: Trusted publishing requires &lt;a href="https://docs.npmjs.com/cli/v11" rel="noopener noreferrer"&gt;npm CLI&lt;/a&gt; version 11.5.1 or later.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I needed to upgrade the version of npm used by my GitHub Actions workflow, so I added a simple step to install the latest version of npm as part of the run before publishing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm install -g npm@latest&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Automatic provenance
&lt;/h2&gt;

&lt;p&gt;The docs also say:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;When you publish using trusted publishing, npm automatically generates and publishes &lt;a href="https://docs.npmjs.com/generating-provenance-statements" rel="noopener noreferrer"&gt;provenance attestations&lt;/a&gt; for your package. This happens by default—you don't need to add the --provenance flag to your publish command.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I did not find this to be the case. I needed to add the &lt;code&gt;--provenance&lt;/code&gt; flag so that my package would publish successfully.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm publish --provenance&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This was something that &lt;a href="https://github.com/npm/cli/issues/8730#issuecomment-3538909998" rel="noopener noreferrer"&gt;seemed to help others&lt;/a&gt; too. You may only need to pass &lt;code&gt;--provenance&lt;/code&gt; the first time, with it continuing to work automatically beyond that, but it can't hurt to keep it in your publish script (for when you need to update another package and you copy things over).&lt;/p&gt;

&lt;p&gt;You can also set your package to generate provenance attestations on publishing by setting the &lt;a href="https://docs.npmjs.com/cli/v11/using-npm/config#provenance" rel="noopener noreferrer"&gt;&lt;code&gt;provenance&lt;/code&gt; option in &lt;code&gt;publishConfig&lt;/code&gt;&lt;/a&gt; in your &lt;em&gt;package.json&lt;/em&gt; file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"publishConfig"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"provenance"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or you can set the &lt;code&gt;NPM_CONFIG_PROVENANCE&lt;/code&gt; environment variable.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;NPM_CONFIG_PROVENANCE&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm publish&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Repository details
&lt;/h2&gt;

&lt;p&gt;Finally, I don't know if this last part helped as I did already have it set, but &lt;a href="https://github.com/npm/cli/issues/8730#issuecomment-3544738786" rel="noopener noreferrer"&gt;others in this GitHub thread&lt;/a&gt; found that setting the &lt;a href="https://docs.npmjs.com/cli/v11/configuring-npm/package-json#repository" rel="noopener noreferrer"&gt;&lt;code&gt;repository&lt;/code&gt; field in the package's &lt;em&gt;package.json&lt;/em&gt;&lt;/a&gt; to specifically point to the GitHub repository also helped.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"repository"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"git"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"git+https://github.com/${username}/${packageName}.git"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you set up your trusted publisher in npm you do have to provide the repository details, so it makes sense to me that the package should agree with those details too.&lt;/p&gt;

&lt;h2&gt;
  
  
  Keep the ecosystem safe
&lt;/h2&gt;

&lt;p&gt;Short-lived tokens, trusted publishing, and provenance all help keep the entire ecosystem safe. If you've read this far, it is because you are also updating your packages to publish with this method&lt;/p&gt;

&lt;p&gt;I know there are people out there with many more packages, and packages that are much more popular than any of mine, but I hope this helps. It does amuse me that I went through this for &lt;a href="https://www.npmjs.com/package/@philnash/resend" rel="noopener noreferrer"&gt;a package that I'm pretty sure I'm the only user of&lt;/a&gt;, but at least I now know how to do it for the future.&lt;/p&gt;

&lt;p&gt;I hope to see trusted publishing continue to expand to more providers, it is limited to GitHub and GitLab at the time of writing, and to be used by more packages. And I hope to see fewer worms charging through the package ecosystem and threatening all of our applications in the future.&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>npm</category>
      <category>node</category>
    </item>
    <item>
      <title>How wrong can a JavaScript Date calculation go?</title>
      <dc:creator>Phil Nash</dc:creator>
      <pubDate>Wed, 14 Jan 2026 22:46:00 +0000</pubDate>
      <link>https://forem.com/philnash/how-wrong-can-a-javascript-date-calculation-go-o26</link>
      <guid>https://forem.com/philnash/how-wrong-can-a-javascript-date-calculation-go-o26</guid>
      <description>&lt;p&gt;The &lt;code&gt;Date&lt;/code&gt; object in JavaScript is frequently one that causes trouble. So much so, it is set to be replaced by &lt;a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Temporal" rel="noopener noreferrer"&gt;&lt;code&gt;Temporal&lt;/code&gt;&lt;/a&gt; soon. This is the story of an issue that I faced that will be much easier to handle once &lt;code&gt;Temporal&lt;/code&gt; is more widespread.&lt;/p&gt;

&lt;h2&gt;
  
  
  The issue
&lt;/h2&gt;

&lt;p&gt;In January 2025 I was in Santa Clara, California writing some JavaScript to perform some reporting. I wanted to be able to get a number of events that happened within a month, so I would create a date object for the first day of the month, add one month to it and then subtract a day to return the last day. Seems straightforward, right?&lt;/p&gt;

&lt;p&gt;I got a really weird result though. I reduced the issue to the following code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const date = new Date("2024-01-01T00:00:00.000Z");
date.toISOString();
// =&amp;gt; "2024-01-01T00:00:00.000Z" as expected
date.setMonth(1);
date.toISOString();
// =&amp;gt; "2023-03-04-T00:00:00.000Z" WTF?

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I added a month to the 1st of January 2024 and landed on the 4th March, 2023. What happened?&lt;/p&gt;

&lt;h2&gt;
  
  
  Times and zones
&lt;/h2&gt;

&lt;p&gt;You might have thought it was odd for me to set this scene on the West coast of the US, but it turned out this mattered. This code would have run fine in UTC and everywhere East of it.&lt;/p&gt;

&lt;p&gt;JavaScript dates are more than just dates, they are responsible for time as well. Even though I only wanted to deal with days and months in this example the time still mattered.&lt;/p&gt;

&lt;p&gt;I did know this, so I set the time to UTC thinking that this would work for me wherever I was. That was my downfall. Let's break down what happened.&lt;/p&gt;

&lt;p&gt;Midnight on the 1st January, 2024 in UTC is still 4pm on the 31st December, 2023 in Pacific Time (UTC-8). &lt;code&gt;date.setMonth(1)&lt;/code&gt; sets the date to February (as months are 0-indexed unlike days). But we started on 31st December, 2023 so JavaScript has to handle the non-existant date of 31st February, 2023. It does this by overflowing to the next month, so we get 3rd March. Finally, to print it out, the date is translated back into UTC, giving the final result: midnight on 4th March, 2023.&lt;/p&gt;

&lt;p&gt;All of these steps feel reasonable when you break it down, the confusion stems from how unexpected that result was.&lt;/p&gt;

&lt;p&gt;So, how do you fix this?&lt;/p&gt;

&lt;h2&gt;
  
  
  Always use UTC
&lt;/h2&gt;

&lt;p&gt;Since I didn't actually care for the time and I knew I wanted to work with UTC, I fixed this code using the &lt;code&gt;Date&lt;/code&gt; object's &lt;a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Date/setUTCMonth" rel="noopener noreferrer"&gt;&lt;code&gt;setUTCMonth&lt;/code&gt; method&lt;/a&gt;. My original code subtracted a day to get the last day in a month, so I used the &lt;a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Date/setUTCDate" rel="noopener noreferrer"&gt;&lt;code&gt;setUTCDate&lt;/code&gt; method&lt;/a&gt; too. All &lt;code&gt;set${timePeriod}&lt;/code&gt; methods have a &lt;code&gt;setUTC${timePeriod}&lt;/code&gt; equivalent to help you work with this.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const date = new Date("2024-01-01T00:00:00.000Z");
date.toISOString();
// =&amp;gt; "2024-01-01T00:00:00.000Z"
date.setUTCMonth(1);
date.toISOString();
// =&amp;gt; "2024-02-01-T00:00:00.000Z"

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So this fixed my issue. Can it be better though?&lt;/p&gt;

&lt;h2&gt;
  
  
  Bring on Temporal
&lt;/h2&gt;

&lt;p&gt;One of the reasons this went wrong was because I was trying to manipulate dates, but I was actually manipulating dates and times without thinking about it. I mentioned &lt;code&gt;Temporal&lt;/code&gt; at the top of the post because it has objects specifically for this.&lt;/p&gt;

&lt;p&gt;If I was to write this code using &lt;code&gt;Temporal&lt;/code&gt; I would be able to use the &lt;a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Temporal/PlainDate" rel="noopener noreferrer"&gt;&lt;code&gt;Temporal.PlainDate&lt;/code&gt;&lt;/a&gt; to represent a calendar date, a date without a time or time zone.&lt;/p&gt;

&lt;p&gt;This simplifies things already, but &lt;code&gt;Temporal&lt;/code&gt; also makes it more obvious how to manipulate dates. Rather than setting months and dates or adding milliseconds to update a date, you add a duration. You can either construct a duration with the &lt;a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Temporal/Duration" rel="noopener noreferrer"&gt;&lt;code&gt;Temporal.Duration&lt;/code&gt; object&lt;/a&gt; or use an object that defines a duration.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Temporal&lt;/code&gt; also makes objects immutable, so every time you change a date it returns a new object.&lt;/p&gt;

&lt;p&gt;In this case I wanted to add a month, so with &lt;code&gt;Temporal&lt;/code&gt; it would look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const startDate = Temporal.PlainDate.from("2024-01-01");
// =&amp;gt; Temporal.PlainDate 2024-01-01
const nextMonth = startDate.add({ months: 1 });
// =&amp;gt; Temporal.PlainDate 2024-02-01
const endDate = nextMonth.subtract({ days: 1 });
// =&amp;gt; Temporal.PlainDate 2024-01-31

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Date manipulation without worrying about times, wonderful!&lt;/p&gt;

&lt;p&gt;Of course, there are many more benefits to the very well throught out &lt;code&gt;Temporal&lt;/code&gt; API and I cannot wait for it to be a part of every JavaScript runtime.&lt;/p&gt;

&lt;h2&gt;
  
  
  Mind the time zone
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;Temporal&lt;/code&gt; is just rolling out to JavaScript engines. At the time of writing, it is available in Firefox and &lt;a href="https://developer.chrome.com/blog/new-in-chrome-144#temporal" rel="noopener noreferrer"&gt;just landed in Chrome 144&lt;/a&gt;. It's also listed as &lt;a href="https://caniuse.com/temporal" rel="noopener noreferrer"&gt;behind a flag in Safari Technical Preview&lt;/a&gt;. If you want to test this out open up Firefox or Chrome, or check out one of the polyfills &lt;a href="https://github.com/js-temporal/temporal-polyfill" rel="noopener noreferrer"&gt;@js-temporal/polyfill&lt;/a&gt; or &lt;a href="https://www.npmjs.com/package/temporal-polyfill" rel="noopener noreferrer"&gt;temporal-polyfill&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you still have to use &lt;code&gt;Date&lt;/code&gt; make sure you keep your time zone in mind. I'd try to move to, or at least learn how to use, &lt;code&gt;Temporal&lt;/code&gt; now.&lt;/p&gt;

&lt;p&gt;And watch out for time zones, even when you try to avoid them they can end up giving you a headache.&lt;/p&gt;

</description>
      <category>javascript</category>
    </item>
    <item>
      <title>Improve Your Python Search Relevancy with Astra DB Hybrid Search</title>
      <dc:creator>Phil Nash</dc:creator>
      <pubDate>Wed, 30 Apr 2025 00:44:04 +0000</pubDate>
      <link>https://forem.com/datastax/improve-your-python-search-relevancy-with-astra-db-hybrid-search-11d6</link>
      <guid>https://forem.com/datastax/improve-your-python-search-relevancy-with-astra-db-hybrid-search-11d6</guid>
      <description>&lt;p&gt;Astra DB now supports hybrid search, which can increase the accuracy of your search by up to 45%. It does this by performing both vector search and BM25 keyword search and then reranking the results from both to return the most relevant results.&lt;/p&gt;

&lt;p&gt;In this post, we'll take a look at how to use Astra DB Hybrid Search in Python.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is hybrid search?
&lt;/h2&gt;

&lt;p&gt;Before we get to the code, let's go over what hybrid search actually is and why it helps. You would typically build a retrieval-augmented generation (RAG) app by &lt;a href="https://www.datastax.com/blog/how-to-create-vector-embeddings-in-python?utm_medium=byline&amp;amp;utm_campaign=improve-python-search-relevance-with-astra-db-hybrid-search&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;creating vector embeddings&lt;/a&gt; for your unstructured content and storing them in a database. Then, when a user makes a query, you turn the query into a vector embedding and use it to perform a similarity search to return relevant context that you can provide to a large language model (LLM) to generate an answer.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fphtg571c6x9xj038bun8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fphtg571c6x9xj038bun8.png" alt="A flow chart showing the ingestion and generation phases of a RAG application." width="680" height="330"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The more accurate and relevant your search results from your database are, the better your RAG application will be. With better context, there’s less opportunity for the LLM to return inaccurate or hallucinated responses.&lt;/p&gt;

&lt;p&gt;To improve on the relevancy of this system, we need to focus on the search element. Vector search is great at understanding context and meaning, but it can miss results that would be returned from a keyword match. Meanwhile, keyword search can be restrictive as it doesn't understand context. Performing both searches gives us the best chance of returning the top results, but you then need to combine those results so you can pass them to an LLM. This is where reranking comes in.&lt;/p&gt;

&lt;p&gt;Reranking is performed by another machine learning model—a cross-encoder—that more accurately scores relevance because the model uses both the original query and the document to create the score. You can't use reranking models for search because it would require scoring every document in your database against the query every time; for small subsets of your data, however, this is achievable.&lt;/p&gt;

&lt;p&gt;You can actually use a reranker to help improve vector search results, by returning more results than required, reranking to adjust the order, then returning the top results.&lt;/p&gt;

&lt;p&gt;In hybrid search, we use reranking to rescore the combination of results from the vector and keyword searches and pick the top, most relevant results from the output.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fht94d2uo6qtox7fjr6dr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fht94d2uo6qtox7fjr6dr.png" alt="A flow chart showing how hybrid search works. From a database a vector search is performed and returns some results, and a keyword search is performed and returns different results. The results are then passed through a reranker and the output is a mix of the results representing the most relevant results." width="440" height="651"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Astra DB can now perform hybrid search by combining vector search and BM25 keyword search, then reranking using the &lt;a href="https://developer.nvidia.com/nemo-retriever/" rel="noopener noreferrer"&gt;NVIDIA NeMo Retriever&lt;/a&gt; reranking microservices (including the &lt;a href="https://build.nvidia.com/nvidia/llama-3_2-nv-rerankqa-1b-v2/modelcard" rel="noopener noreferrer"&gt;nvidia/llama-3.2-nv-rerankqa-1b-v2 reranking model&lt;/a&gt;). Let's take a look at how to use Astra DB hybrid search to improve search relevancy in your Python application.&lt;/p&gt;

&lt;h2&gt;
  
  
  Hybrid Search in Python with Astra DB
&lt;/h2&gt;

&lt;p&gt;Let's start by creating a database in your DataStax account. While it’s provisioning, let's get our coding environment set up.&lt;/p&gt;

&lt;p&gt;To use Hybrid Search in Python, you’ll need to install version 2 of &lt;a href="https://github.com/datastax/astrapy" rel="noopener noreferrer"&gt;astrapy&lt;/a&gt; as well as &lt;a href="https://github.com/theskumar/python-dotenv" rel="noopener noreferrer"&gt;python-dotenv&lt;/a&gt; so that you can load environment variables from an &lt;em&gt;.env&lt;/em&gt; file. Install the dependencies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="s2"&gt;"astrapy&amp;gt;=2.0,&amp;lt;3.0"&lt;/span&gt; python-dotenv
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create a file called &lt;em&gt;.env&lt;/em&gt; and add your database API endpoint, access token and choose a name for your collection.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;ASTRA_DB_API_ENDPOINT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;
&lt;span class="nv"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;
&lt;span class="nv"&gt;ASTRA_DB_COLLECTION_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Creating a collection for hybrid search
&lt;/h3&gt;

&lt;p&gt;Once the database is created, we'll need to create a collection to store our data in. We'll do this in code, because we want to create some settings that aren't yet available in the dashboard.&lt;/p&gt;

&lt;p&gt;Create a file called &lt;em&gt;create_collection.py&lt;/em&gt; and add this code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;astrapy&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DataAPIClient&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;astrapy.info&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;CollectionDefinition&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;astrapy.constants&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;VectorMetric&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dotenv&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dotenv&lt;/span&gt;

&lt;span class="nf"&gt;load_dotenv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DataAPIClient&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_database&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_API_ENDPOINT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;collection_definition&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;CollectionDefinition&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_vector_dimension&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_vector_metric&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;VectorMetric&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DOT_PRODUCT&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_vector_service&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;nvidia&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;NV-Embed-QA&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_lexical&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tokenizer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;standard&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;args&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{}},&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;filters&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;lowercase&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stop&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;porterstem&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;asciifolding&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_COLLECTION_NAME&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;definition&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;collection_definition&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this code we create a definition for our collection and then create the collection. The definition includes details on how we want the collection to create vectors for our data as well as how it should treat the keyword search.&lt;/p&gt;

&lt;p&gt;For vector search, we are using &lt;a href="https://docs.datastax.com/en/astra-db-serverless/databases/embedding-generation.html?utm_medium=byline&amp;amp;utm_campaign=improve-python-search-relevance-with-astra-db-hybrid-search&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Astra Vectorize&lt;/a&gt; with the built-in NVIDIA NeMo Retriever nv-embed-qa model to create vector embeddings on insert and search. The model creates vectors with 1024 dimensions, and we configure the collection to use the &lt;a href="https://docs.datastax.com/en/dse/6.9/get-started/vector-concepts.html#dot-product-metric?utm_medium=byline&amp;amp;utm_campaign=improve-python-search-relevance-with-astra-db-hybrid-search&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;dot product&lt;/a&gt; to calculate similarity between vectors.&lt;/p&gt;

&lt;p&gt;For the keyword search, the default performs exact keyword matching, but we can tweak this a bit with settings like this. First, we define the tokenizer, which is how the collection breaks up the text into words. We'll use the standard tokenizer, which divides based on word boundaries and strips out punctuation. We then add filters, which transform the text to make it easier to match searches. In this case, we add four filters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;lowercase -&lt;/strong&gt; converts all the text to lowercase&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;stop -&lt;/strong&gt; removes English stop words&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;porterstem -&lt;/strong&gt; applies the Porter Stemming algorithm for English, which translates different forms of words to a common stem, e.g. "search", "searches", and "searched" will all translate to the token "search"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;asciifolding -&lt;/strong&gt; translates characters into ASCII, that is it turns accented characters into an ASCII equivalent if it exists, e.g. "café" becomes "cafe"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Note that both the stop and porterstem filters are specific to English texts.&lt;/p&gt;

&lt;p&gt;You can choose to include the filters that will work best for your data. There is more on the &lt;a href="https://docs.datastax.com/en/astra-db-serverless/databases/analyzers.html#supported-built-in-analyzers?utm_medium=byline&amp;amp;utm_campaign=improve-python-search-relevance-with-astra-db-hybrid-search&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;available filters and links to further information in the Astra DB documentation&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Now we've created our collection, we can ingest some data to search against.&lt;/p&gt;

&lt;h3&gt;
  
  
  Indexing data for hybrid search
&lt;/h3&gt;

&lt;p&gt;Save this &lt;a href="https://gist.github.com/philnash/2525bb35cc83f7ec544c02e91fc3231f" rel="noopener noreferrer"&gt;list of made up restaurant descriptions&lt;/a&gt; that we'll use as our example data as a JSON file called &lt;em&gt;restaurants.json&lt;/em&gt;. Create a new file called &lt;em&gt;ingest.py&lt;/em&gt; and add the following code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;astrapy&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DataAPIClient&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dotenv&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dotenv&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="nf"&gt;load_dotenv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DataAPIClient&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_database&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_API_ENDPOINT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_COLLECTION_NAME&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;restaurants.json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;restaurant_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;restaurants&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$hybrid&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;restaurant&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;restaurant&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;restaurant_data&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insert_many&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;restaurants&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this code we load the restaurant descriptions and then create each as a document in Astra DB passing in the description as the &lt;code&gt;$hybrid&lt;/code&gt; property. Creating documents with the &lt;code&gt;$hybrid&lt;/code&gt; property does two things.&lt;/p&gt;

&lt;p&gt;It will use the NVIDIA NeMo Retriever embedding model that we configured when we created the collection to create vector embeddings of the content. This is the same as using Astra Vectorize to generate embeddings.&lt;/p&gt;

&lt;p&gt;It will also index the text for the new BM25 keyword search.&lt;/p&gt;

&lt;p&gt;Run the code with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python ingest.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check your collection in the DataStax dashboard, you should find both &lt;code&gt;$vectorize&lt;/code&gt; and &lt;code&gt;$lexical&lt;/code&gt; properties.&lt;/p&gt;

&lt;h3&gt;
  
  
  Performing a hybrid search
&lt;/h3&gt;

&lt;p&gt;Having indexed using &lt;code&gt;$hybrid,&lt;/code&gt; we can now perform vector and hybrid searches against this collection. Create a file called &lt;em&gt;search.py&lt;/em&gt; and enter the following code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;astrapy&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DataAPIClient&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dotenv&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dotenv&lt;/span&gt;

&lt;span class="nf"&gt;load_dotenv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DataAPIClient&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_database&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_API_ENDPOINT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_COLLECTION_NAME&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="n"&gt;cursor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;sort&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$vectorize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;salads&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;projection&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$vectorize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;document&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will perform a vector search on the collection using Astra Vectorize when you run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python search.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you run this search you will see five results. In position four is "The Green Leaf Eatery," which is the most "salads" sounding place on the list to me. Positions one and two do mention salads, and, because it is vector search and not keyword search, position three, "Fusion Flavors Bistro," doesn't mention salads at all.&lt;/p&gt;

&lt;p&gt;Now, let's update the search to use Hybrid Search and perform reranking on the results. You will need to change the find method to the new &lt;a href="https://docs.datastax.com/en/astra-db-serverless/api-reference/document-methods/find-and-rerank.html#find-documents-with-a-hybrid-search?utm_medium=byline&amp;amp;utm_campaign=improve-python-search-relevance-with-astra-db-hybrid-search&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;&lt;code&gt;find_and_rerank&lt;/code&gt; method&lt;/a&gt; and pass &lt;code&gt;{"$hybrid": query}&lt;/code&gt; as your sort field. You can add other arguments too, like &lt;a href="https://docs.datastax.com/en/astra-db-serverless/api-reference/document-methods/find-and-rerank.html#limit-the-number-of-documents-returned-by-the-underlying-searches?utm_medium=byline&amp;amp;utm_campaign=improve-python-search-relevance-with-astra-db-hybrid-search&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;hybrid_limits, which sets the number of documents to retrieve from each inner query before reranking&lt;/a&gt;, and &lt;a href="https://docs.datastax.com/en/astra-db-serverless/api-reference/document-methods/find-and-rerank.html#include-the-scores-in-the-response?utm_medium=byline&amp;amp;utm_campaign=improve-python-search-relevance-with-astra-db-hybrid-search&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;include_scores, which shows the various scores used to rank documents along the way&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;cursor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find_and_rerank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;sort&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$hybrid&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;salads&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;hybrid_limits&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;projection&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$vectorize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;include_scores&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;scores&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now you will see these results:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;'$vectorize':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The Green Leaf Eatery: A bright and airy vegetarian and vegan restaurant focusing on fresh, seasonal produce. Their innovative menu features creative plant-based dishes, from vibrant salads and grain bowls to hearty vegetable curries and decadent vegan desserts. It's a celebration of healthy and delicious eating."&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;'$rerank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;-3.6972656&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$vector':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.67285335&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$vectorRank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$bm&lt;/span&gt;&lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="err"&gt;Rank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$rrf':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.03175403&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;'$vectorize':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'The&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Bohemian&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Brew&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;&amp;amp;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Bites:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;A&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;quirky&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;eclectic&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;cafe&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;offering&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;relaxed&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;atmosphere&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;diverse&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;menu.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Enjoy&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;gourmet&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;sandwiches&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;on&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;artisanal&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;bread&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;creative&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;salads&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;with&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;house-made&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;dressings&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;selection&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;of&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;globally&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;inspired&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;small&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;plates.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Their&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;extensive&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;coffee&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;craft&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;beer&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;menu&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;makes&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;it&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;perfect&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;spot&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;for&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;casual&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;bite&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;or&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;leisurely&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;hangout.'&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;'$rerank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;-4.5507812&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$vector':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.6813005&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$vectorRank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$bm&lt;/span&gt;&lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="err"&gt;Rank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$rrf':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.032002047&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;'$vectorize':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'The&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Olive&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Grove&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Mediterranean:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Transport&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;yourself&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;to&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;sunny&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;shores&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;of&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Mediterranean&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;at&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;this&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;charming&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;restaurant.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Their&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;menu&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;features&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;flavorful&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Greek&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Turkish&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;dishes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;grilled&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;kebabs&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;savory&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;spanakopita&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;to&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;creamy&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;hummus&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;vibrant&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;salads.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Enjoy&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;fresh&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;herbs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;olive&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;oil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;sun-drenched&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;flavors.'&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;'$rerank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;-5.1210938&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$vector':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.68404347&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$vectorRank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$bm&lt;/span&gt;&lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="err"&gt;Rank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$rrf':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.032786883&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;'$vectorize':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'Fusion&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Flavors&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Bistro:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;A&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;contemporary&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;restaurant&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;that&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;creatively&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;blends&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;different&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;culinary&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;traditions.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Expect&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;unexpected&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;exciting&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;flavor&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;combinations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;innovative&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;presentations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;menu&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;that&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;constantly&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;evolves.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;This&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;is&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;place&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;for&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;adventurous&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;palates&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;seeking&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;unique&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;dining&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;experience.'&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;'$rerank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;-11.375&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$vector':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.67336804&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$vectorRank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$bm&lt;/span&gt;&lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="err"&gt;Rank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$rrf':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.015873017&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;'$vectorize':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'The&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Farmhouse&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Kitchen:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;A&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;rustic&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;charming&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;restaurant&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;celebrating&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;bounty&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;of&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;local&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;farm.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Their&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;menu&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;changes&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;seasonally&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;featuring&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;dishes&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;made&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;with&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;freshest&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;ingredients&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;sourced&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;directly&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;nearby&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;farms.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Expect&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;simple&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;yet&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;elegant&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;preparations&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;that&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;highlight&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;natural&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;flavors&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;of&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;ingredients.'&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;'$rerank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;-11.375&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$vector':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.6582356&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$vectorRank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$bm&lt;/span&gt;&lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="err"&gt;Rank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$rrf':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.014925373&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this output, you can see the results and also the various scores that were used to rank them. You can see that "The Green Leaf Eatery" now ranks first on the list having been ranked in fourth by vector search and second by the BM25 search. The reranker lifted it up to first place.&lt;/p&gt;

&lt;p&gt;There are other similar movements in the list, plus in the fifth position was a restaurant that was initially ranked seventh by the vector search and doesn't contain the search term "salads." Hybrid Search initially returns more results than we need, reranks them and then returns the most relevant, so this result was lifted up into a position to be returned. Positions four and five also received the same rerank score, so were placed in their order based on one more score that is calculated, reciprocal rank fusion (RRF). &lt;a href="https://www.datastax.com/blog/reranker-algorithm-showdown-vector-search?utm_medium=byline&amp;amp;utm_campaign=improve-python-search-relevance-with-astra-db-hybrid-search&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;RRF isn't great for reranking&lt;/a&gt;, but is very quick, so is useful to help with tie-breaks here.&lt;/p&gt;

&lt;p&gt;Try running vector and hybrid searches with other search terms to get a feel for the results. In our testing, we’ve seen &lt;a href="https://www.datastax.com/blog/introducing-astra-db-hybrid-search?utm_medium=byline&amp;amp;utm_campaign=improve-python-search-relevance-with-astra-db-hybrid-search&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Hybrid Search improve relevance by up to 45%&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Next, we'll take a look at a couple of other things you will need to consider when using Hybrid Search.&lt;/p&gt;

&lt;h2&gt;
  
  
  Providing your own vectors
&lt;/h2&gt;

&lt;p&gt;The example above used Astra Vectorize to automatically create vector embeddings, but you can always use a different model and provide your own vectors.&lt;/p&gt;

&lt;p&gt;If you do use your own vector embedding model, then you will need to provide both the vector and the text that will be indexed for keyword search. You can do this with the special property &lt;code&gt;$lexical&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Imagine you have a method that &lt;a href="https://www.datastax.com/blog/how-to-create-vector-embeddings-in-python?utm_medium=byline&amp;amp;utm_campaign=improve-python-search-relevance-with-astra-db-hybrid-search&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;creates a vector embedding&lt;/a&gt; called &lt;code&gt;create_embedding&lt;/code&gt;. You might then ingest the data like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;restaurants.json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;restaurant_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;restaurants&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$vector&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;create_embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;restaurant&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$lexical&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;restaurant&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;restaurant&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;restaurant&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;restaurant_data&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insert_many&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;restaurants&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, when you perform a hybrid search, you need to provide a &lt;code&gt;$vector&lt;/code&gt; with which to search. Also, the default property on which the content is reranked is &lt;code&gt;$vectorize&lt;/code&gt;, so you need to tell the database which property to rerank on too.&lt;/p&gt;

&lt;p&gt;You also need to set the query that you want to use to perform the reranking. It can be the same query that you use for the vector search and the keyword search, or something else. You can see more about using different searches below.&lt;/p&gt;

&lt;p&gt;You can define the query with the &lt;code&gt;rerank_query&lt;/code&gt; argument and the field on which to perform the reranking with the r&lt;code&gt;erank_on&lt;/code&gt; argument. For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;salad&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;cursor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find_and_rerank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;sort&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$hybrid&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$vector&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;create_embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$lexical&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;rerank_query&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;rerank_on&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;hybrid_limits&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;scores&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Performing different searches
&lt;/h2&gt;

&lt;p&gt;You can also use different terms to perform your initial searches. This is useful because BM25 keyword search acts as a filter on the query keywords.&lt;/p&gt;

&lt;p&gt;In our Hybrid Search example above, only three restaurant descriptions mentioned "salads" so only three results had a &lt;code&gt;$bm25Rank&lt;/code&gt; in the results.&lt;/p&gt;

&lt;p&gt;That worked fine for our example, but when we're dealing with a RAG application, the search queries are often in natural language rather than keyword focused. We already set up our collection to use word stems and translate accented characters into ASCII. You may also want to perform keyword extraction, using &lt;a href="https://spotintelligence.com/2022/12/13/keyword-extraction/" rel="noopener noreferrer"&gt;something like NLTK, SpaCy or keyBERT&lt;/a&gt;, on the user query so you can then use the keywords for the lexical search. This would look like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;I&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;m looking for a restaurant that serves the best salad&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;cursor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find_and_rerank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;sort&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$hybrid&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$vector&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;create_embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$lexical&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;extract_keywords&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;rerank_query&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;rerank_on&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;hybrid_limits&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;scores&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The above code will now perform the vector search with your own vector embedding model, keyword search using keywords extracted from the user query and then rerank the results based on the initial query.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try hybrid search for better search and RAG relevancy
&lt;/h2&gt;

&lt;p&gt;Combining vector search with keyword search and a reranking model like NVIDIA NeMo Retriever nvidia/llama-3.2-nv-rerankqa-1b-v2 produces more relevant results, improving the output of your RAG application. You can get started with hybrid search and reranking in Astra DB today by &lt;a href="https://astra.datastax.com/?utm_medium=byline&amp;amp;utm_campaign=improve-python-search-relevance-with-astra-db-hybrid-search&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;signing up&lt;/a&gt; and using AstraPy or with &lt;a href="https://docs.langflow.org/components-vector-stores#hybrid-search" rel="noopener noreferrer"&gt;Langflow&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you want to chat more about improving retrieval accuracy, drop into the &lt;a href="https://discord.gg/datastax" rel="noopener noreferrer"&gt;DataStax Devs Discord&lt;/a&gt; or drop me an email at &lt;a href="mailto:phil.nash@datastax.com"&gt;phil.nash@datastax.com&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>python</category>
      <category>vectordatabase</category>
      <category>rag</category>
      <category>genai</category>
    </item>
    <item>
      <title>Build a RAG Chat App with Firebase Genkit and Astra DB</title>
      <dc:creator>Phil Nash</dc:creator>
      <pubDate>Wed, 16 Apr 2025 04:17:10 +0000</pubDate>
      <link>https://forem.com/datastax/build-a-rag-chat-app-with-firebase-genkit-and-astra-db-31n1</link>
      <guid>https://forem.com/datastax/build-a-rag-chat-app-with-firebase-genkit-and-astra-db-31n1</guid>
      <description>&lt;p&gt;Today &lt;a href="https://www.datastax.com/blog/introducing-astra-db-plugin-for-firebase-genkit?utm_medium=byline&amp;amp;utm_campaign=rag-powered-chat-bot-genkit-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;we announced the release of a plugin for Firebase's Genkit framework&lt;/a&gt; for building generative AI applications. &lt;a href="https://firebase.google.com/docs/genkit" rel="noopener noreferrer"&gt;Genkit&lt;/a&gt; is a powerful framework that provides the primitives for building production-quality GenAI applications. From easy access to models, prompts, indexers, and retrievers, to more advanced features like flows, traces, and evals, its power lies in making it easy to do the right thing while building GenAI applications.&lt;/p&gt;

&lt;p&gt;In this post, we'll take a look at how to use the Astra DB plugin for Genkit to build a &lt;a href="https://www.datastax.com/guides/what-is-retrieval-augmented-generation?utm_medium=byline&amp;amp;utm_campaign=rag-powered-chat-bot-genkit-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;retrieval-augmented generation application&lt;/a&gt; with Genkit.&lt;/p&gt;

&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/D7MqKnKtpnE"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  Building a RAG application
&lt;/h2&gt;

&lt;p&gt;Let's build a RAG application from scratch and see how straightforward it can be with Genkit and Astra DB. First, you'll need a Gemini API key, which you can get from &lt;a href="https://aistudio.google.com/" rel="noopener noreferrer"&gt;Google AI Studio&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;You’ll also need an Astra DB database to store your data and vectors; if you don't already have an account you can sign up for &lt;a href="https://astra.datastax.com/signup?utm_medium=byline&amp;amp;utm_campaign=rag-powered-chat-bot-genkit-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;a free DataStax account&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Start by creating a new Astra DB database; give it a name and choose a cloud and region. This takes a couple of minutes, so carry on with the next steps while it starts up.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpxfptu0z6qyt393d7ai6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpxfptu0z6qyt393d7ai6.png" alt="The " width="800" height="727"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Setting up the app
&lt;/h3&gt;

&lt;p&gt;Create a directory for your app and install the dependencies you'll need:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir &lt;/span&gt;genkit-astra-db-rag
&lt;span class="nb"&gt;cd &lt;/span&gt;genkit-astra-db-rag
npm init &lt;span class="nt"&gt;--yes&lt;/span&gt;
npm &lt;span class="nb"&gt;install &lt;/span&gt;genkit @genkit-ai/googleai genkitx-astra-db
npm &lt;span class="nb"&gt;install &lt;/span&gt;genkit-cli tsx &lt;span class="nt"&gt;-D&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create a file to work in:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;touch &lt;/span&gt;index.ts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open &lt;em&gt;index.ts&lt;/em&gt; and import the dependencies you installed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;genkit&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;Document&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;genkit&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;textEmbedding004&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;googleAI&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;gemini20Flash&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@genkit-ai/googleai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;astraDBIndexerRef&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;astraDBRetrieverRef&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;astraDB&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;genkitx-astra-db&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this case, we're pulling in &lt;a href="https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/text-embeddings-api" rel="noopener noreferrer"&gt;Google's text-embedding-004 model&lt;/a&gt; for creating vector embeddings, and the &lt;a href="https://deepmind.google/technologies/gemini/flash/" rel="noopener noreferrer"&gt;Gemini Flash 2.0 model for generation&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;It's about time to create a collection in which we can store our vectors. Hopefully your database has been created now, so head to the &lt;a href="https://astra.datastax.com/?utm_medium=byline&amp;amp;utm_campaign=rag-powered-chat-bot-genkit-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;DataStax dashboard&lt;/a&gt;, choose your database, open the Data Explorer, and create a collection. Give the collection a name and choose "Bring my own" for the embedding generation method. The text-embedding-004 model creates vectors with 768 dimensions (though you can choose fewer), so enter 768 for the number of dimensions and choose "Cosine" for the similarity metric.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6uhm65n6ylws36uhkm0j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6uhm65n6ylws36uhkm0j.png" alt="The " width="800" height="727"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once you've created the collection, you'll need the API endpoint of the database, the collection name and to generate an API token.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa2a5citkb3zsvati0aq7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa2a5citkb3zsvati0aq7.png" alt="The database Overview tab. Highlighted on the right hand side of the page are the Database Details including the API endpoint a button for generating an Application Token." width="800" height="727"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With those, create a &lt;em&gt;.env&lt;/em&gt; file in your application and enter the credentials:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;ASTRA_DB_API_ENDPOINT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;
&lt;span class="nv"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;
&lt;span class="nv"&gt;ASTRA_DB_COLLECTION_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Also in the &lt;em&gt;.env&lt;/em&gt; file, enter your API key from AI Studio too:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;GEMINI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we can configure Genkit. In &lt;em&gt;index.ts&lt;/em&gt; create the &lt;code&gt;ai&lt;/code&gt; object like so:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;collectionName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ASTRA_DB_COLLECTION_NAME&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ai&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;genkit&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;plugins&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="nf"&gt;googleAI&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="nf"&gt;astraDB&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;clientParams&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;applicationToken&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;apiEndpoint&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ASTRA_DB_API_ENDPOINT&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="na"&gt;collectionName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;collectionName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;embedder&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;textEmbedding004&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;]),&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This sets up Genkit with the Google AI plugin for models and embeddings and the Astra DB plugin, configured with the credentials to access the collection you just created and the vector embedding model &lt;em&gt;text-embedding-004&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;We can now access the Astra DB indexer and retriever via the reference functions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;astraDBIndexer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;astraDBIndexerRef&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;collectionName&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;astraDBRetriever&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;astraDBRetrieverRef&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;collectionName&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The indexer is used to store documents in the collection and the retriever is used to perform vector search to return documents from the collection.&lt;/p&gt;

&lt;h2&gt;
  
  
  Ingesting data
&lt;/h2&gt;

&lt;p&gt;Now we can ingest some data into Astra DB. For this RAG application, let's grab data from the web. To ingest web data, we'll need to fetch it from a URL and then extract the main content from the returned HTML. I've written before about how I like to &lt;a href="https://www.datastax.com/blog/html-content-retrieval-augmented-generation-readability-js?utm_medium=byline&amp;amp;utm_campaign=rag-powered-chat-bot-genkit-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;use Readability.js to parse out the content from a page&lt;/a&gt;, so we'll follow that. We'll also need something to &lt;a href="https://www.datastax.com/blog/how-to-chunk-text-in-javascript-for-rag-applications?utm_medium=byline&amp;amp;utm_campaign=rag-powered-chat-bot-genkit-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;turn the content into chunks&lt;/a&gt;, let's use &lt;a href="https://www.npmjs.com/package/llm-chunk" rel="noopener noreferrer"&gt;llm-chunk&lt;/a&gt; for this as it's relatively simple.&lt;/p&gt;

&lt;p&gt;Install the dependencies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @mozilla/readability jsdom llm-chunk
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Import them at the top of the script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;import &lt;span class="o"&gt;{&lt;/span&gt; Readability &lt;span class="o"&gt;}&lt;/span&gt; from &lt;span class="s2"&gt;"@mozilla/readability"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
import &lt;span class="o"&gt;{&lt;/span&gt; JSDOM &lt;span class="o"&gt;}&lt;/span&gt; from &lt;span class="s2"&gt;"jsdom"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
import &lt;span class="o"&gt;{&lt;/span&gt; chunk &lt;span class="o"&gt;}&lt;/span&gt; from &lt;span class="s2"&gt;"llm-chunk"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Write a function that takes a URL, fetches the HTML content, extracts the content and returns it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;fetchTextFromWeb&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;html&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;JSDOM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Readability&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;article&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;article&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;textContent&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The next thing to do is write our first Genkit flow to ingest data from a URL into the collection. Flows are functions that you can run via the Genkit UI or through code. Flows have strongly defined input and output schemas using &lt;a href="https://zod.dev/" rel="noopener noreferrer"&gt;zod&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;For this flow we'll accept a string which is a URL. There's no need for an output as the function will just end when it completes successfully.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;indexWebPage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;defineFlow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;indexPage&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;inputSchema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;url&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;describe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;URL&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;outputSchema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;void&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;extract-text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;fetchTextFromWeb&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;chunk-it&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;
      &lt;span class="nf"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;minLength&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;maxLength&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;overlap&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;128&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;documents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fromText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;index&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;indexer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;astraDBIndexer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="nx"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The ingestion pipeline is nice and easy to read as a flow. And using &lt;code&gt;ai.run&lt;/code&gt; around the non-Genkit functions provides an extra level of tracing that we'll be able to see later.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Genkit UI
&lt;/h3&gt;

&lt;p&gt;This seems like a good time to test out what we've built so far. Open &lt;em&gt;package.json&lt;/em&gt; and add a script to run your application code and one to start the Genkit server.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="nl"&gt;"scripts"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"start"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tsx --env-file .env ./index.ts"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"genkit"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"genkit start -- npm start"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now you can run &lt;code&gt;npm run genkit&lt;/code&gt; and open the Genkit UI in your browser at &lt;a href="http://localhost:4000/" rel="noopener noreferrer"&gt;localhost:4000&lt;/a&gt;. You can either find your flow on the dashboard or by clicking on &lt;em&gt;Flows&lt;/em&gt; in the sidebar and then selecting it from the list.&lt;/p&gt;

&lt;p&gt;This gives you a box to add some input. The input is the schema that we set up as the parameters to the flow. In this case, it just expects a string that’s a URL.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhoabu5vbkb2fgy7ixoyj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhoabu5vbkb2fgy7ixoyj.png" alt="The Genkit UI showing the input JSON that you can enter to run the flow we just built as well as the schema that it accepts." width="800" height="727"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Enter a URL and run the flow. Once it's complete, you can open the DataStax dashboard and see the chunks and their vectors stored in the collection.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe144hgbblyyq7e67ktz8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe144hgbblyyq7e67ktz8.png" alt="After successfully ingesting the data through Genkit, the Data Explorer tab in the Datastax dashboard shows the Collection data including the vectors that were created." width="800" height="727"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Back in the Genkit UI you can click on &lt;em&gt;View trace&lt;/em&gt; and you’ll be shown each of the steps the flow took to fetch, chunk, embed and store the data.&lt;/p&gt;

&lt;p&gt;Head back to the Genkit dashboard and open Retrievers from the sidebar. All we did to define the available retriever was set up the Astra DB plugin and export the &lt;code&gt;astraDBRetrieverRef&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;We can already use that retriever from the Genkit UI. Click on the retriever and enter the following in the input:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"some search term"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"metadata"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the options, change the property k to 5. Run the retriever and it will perform a vector search using the text you provide in the input and returning five results from the database.&lt;/p&gt;

&lt;p&gt;We can now hook this up with a full RAG flow, in which we first retrieve context from the database and then pass it to a model to generate a response. Open the code again and define another flow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ragFlow&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;defineFlow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;rag&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;inputSchema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="na"&gt;outputSchema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;retrieve&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;retriever&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;astraDBRetriever&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;options&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;k&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;gemini20Flash&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`
You are a helpful AI assistant that can answer questions.

Use only the context provided to answer the question.
If you don't know, do not make up an answer.

Question: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="nx"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here we use the retriever to search for the string input, and then pass the resulting documents as part of a prompt to the generate function that uses the Gemini Flash 2.0 model to perform the generation.&lt;/p&gt;

&lt;p&gt;Restart the Genkit server, open up the Flows section and choose your RAG flow. You can now input a question, make sure it's relevant to the data you indexed, and Gemini will generate a relevant response based on the docs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkyntz1qacmmzilzno5iu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkyntz1qacmmzilzno5iu.png" alt="In the Genkit UI you can run the RAG flow and see the output of the full pipeline." width="800" height="727"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once again, you can hit the &lt;em&gt;View trace&lt;/em&gt; button to see what happened at each stage in this request.&lt;/p&gt;

&lt;p&gt;We've only used these flows in the Genkit interface so far, but for either of the flows, you can run them like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;indexWebPage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;URL&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Genkit and Astra DB make RAG easy
&lt;/h2&gt;

&lt;p&gt;It took us fewer than 100 lines of code to build the two major flows required for RAG: ingestion and generation. Firebase Genkit made it easy to test our implementation as we went—without us having to build a UI for it. And the tracing in Genkit means it's easier to track down bugs in your flows.&lt;/p&gt;

&lt;p&gt;Astra DB is an easy to use and powerful vector database, and it's even easier to use when all you need to do is configure the plugin in Genkit and reference indexers and retrievers.&lt;/p&gt;

&lt;p&gt;You can find the &lt;a href="https://github.com/philnash/genkit-astra-db-rag" rel="noopener noreferrer"&gt;code for this app on GitHub&lt;/a&gt;. The &lt;a href="https://github.com/datastax/genkitx-astra-db" rel="noopener noreferrer"&gt;Astra DB plugin for Genkit&lt;/a&gt; is open source so if you have any issues or requests, please &lt;a href="https://github.com/datastax/genkitx-astra-db/issues" rel="noopener noreferrer"&gt;open an issue on the GitHub repo&lt;/a&gt;. And check out the &lt;a href="https://firebase.google.com/docs/genkit" rel="noopener noreferrer"&gt;Genkit docs&lt;/a&gt; for more on what you can build with Genkit.&lt;/p&gt;




&lt;h2&gt;
  
  
  Frequently Asked Questions (FAQ)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What is Astra DB?
&lt;/h3&gt;

&lt;p&gt;Astra DB is a cloud-based NoSQL document store. It features an accurate and performant vector index for storing vectors which can be used for similarity searches. It comes with a Genkit plugin for integration with the Firebase Genkit framework.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is Genkit?
&lt;/h3&gt;

&lt;p&gt;Genkit is a framework for building generative AI applications. It provides essential tools such as models, prompts, indexers, retrievers, flows, traces, and evaluations. By using Genkit, developers can efficiently create applications that leverage the power of generative AI.&lt;/p&gt;

&lt;h3&gt;
  
  
  How do I get started with building a RAG application using Genkit and Astra DB?
&lt;/h3&gt;

&lt;p&gt;To build a RAG application with Genkit and Astra DB, you need to create a database and collection within Astra DB and then install Genkit and related dependencies into your Node.js application. Once you've configured Genkit with your Astra DB credentials, you can start creating flows.&lt;/p&gt;

&lt;h3&gt;
  
  
  What does building the RAG application involve?
&lt;/h3&gt;

&lt;p&gt;Building the RAG application involves creating a collection in Astra DB to store vectors, and setting up flows in Genkit to ingest data, and to generate responses based on context retrieved from the database. You can test these flows out using the Genkit UI.&lt;/p&gt;

&lt;h6&gt;
  
  
  DataStax AI Platform:
&lt;/h6&gt;

&lt;p&gt;The Fastest Way to Build and Deploy AI Apps&lt;/p&gt;

&lt;p&gt;&lt;a href="https://astra.datastax.com/signup?type=langflow&amp;amp;utm_medium=byline&amp;amp;utm_campaign=rag-powered-chat-bot-genkit-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Try For Free&lt;/a&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>firebase</category>
      <category>genai</category>
      <category>vectordatabase</category>
    </item>
    <item>
      <title>How to Create Vector Embeddings in Python</title>
      <dc:creator>Phil Nash</dc:creator>
      <pubDate>Wed, 09 Apr 2025 01:26:09 +0000</pubDate>
      <link>https://forem.com/datastax/how-to-create-vector-embeddings-in-python-3am0</link>
      <guid>https://forem.com/datastax/how-to-create-vector-embeddings-in-python-3am0</guid>
      <description>&lt;p&gt;When you’re building a &lt;a href="https://www.datastax.com/guides/what-is-retrieval-augmented-generation" rel="noopener noreferrer"&gt;retrieval-augmented generation (RAG)&lt;/a&gt; app, the first thing you need to do is prepare your data. You need to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;collect your unstructured data&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.datastax.com/blog/chunking-to-get-your-data-ai-ready?utm_medium=byline&amp;amp;utm_campaign=how-to-create-vector-embeddings-in-python&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;split it into chunks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;turn those chunks into &lt;a href="https://www.datastax.com/guides/what-is-a-vector-embedding?utm_medium=byline&amp;amp;utm_campaign=how-to-create-vector-embeddings-in-python&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;vector embeddings&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;store the embeddings in a &lt;a href="https://www.datastax.com/products/vector-search?utm_medium=byline&amp;amp;utm_campaign=how-to-create-vector-embeddings-in-python&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;vector database&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There are many ways that you can create vector embeddings in Python. In this post, we’ll take a look at four ways to generate vector embeddings: locally, via API, via a framework, and with &lt;a href="https://docs.datastax.com/en/astra-db-serverless/databases/embedding-generation.html?utm_medium=byline&amp;amp;utm_campaign=how-to-create-vector-embeddings-in-python&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Astra DB's Vectorize&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Local vector embeddings
&lt;/h2&gt;

&lt;p&gt;There are many &lt;a href="https://huggingface.co/models?pipeline_tag=sentence-similarity&amp;amp;sort=trending" rel="noopener noreferrer"&gt;pre-trained embedding models available on Hugging Face&lt;/a&gt; that you can use to create vector embeddings. &lt;a href="https://www.sbert.net/" rel="noopener noreferrer"&gt;Sentence Transformers (SBERT)&lt;/a&gt; is a library that makes it easy to use these models for vector embedding, as well as cross-encoding for reranking. It even has tools for finetuning models, if that’s something that might be of use.&lt;/p&gt;

&lt;p&gt;You can install the library with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;sentence_transformers
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A popular local model for vector embedding is &lt;a href="https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2" rel="noopener noreferrer"&gt;all-MiniLM-L6-v2&lt;/a&gt;. It’s trained as a good all-rounder that produces a 384-dimension vector from a chunk of text.&lt;/p&gt;

&lt;p&gt;To use it, import &lt;code&gt;sentence_transformers&lt;/code&gt; and create a model using the identifier from Hugging Face, in this case "all-MiniLM-L6-v2". If you want to use a model that isn't in the &lt;a href="https://huggingface.co/sentence-transformers/" rel="noopener noreferrer"&gt;sentence-transformers project&lt;/a&gt;, like the multilingual &lt;a href="https://huggingface.co/BAAI/bge-m3" rel="noopener noreferrer"&gt;BGE-M3&lt;/a&gt;, you can use the organization to identify the model too, like, "BAAI/BGE-M3". Once you've loaded the model, use the &lt;code&gt;encode&lt;/code&gt; method to create the vector embedding. The full code looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sentence_transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SentenceTransformer&lt;/span&gt;


&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SentenceTransformer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;all-MiniLM-L6-v2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;sentence&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A robot may not injure a human being or, through inaction, allow a human being to come to harm.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sentence&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# =&amp;gt; [ 1.95171311e-03  1.51085425e-02  3.36140348e-03  2.48030387e-02 ... ]
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you pass an array of texts to the model, they’ll all be encoded:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sentence_transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SentenceTransformer&lt;/span&gt;


&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SentenceTransformer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;all-MiniLM-L6-v2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;sentences&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A robot may not injure a human being or, through inaction, allow a human being to come to harm.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sentences&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# =&amp;gt; [[ 0.00195174  0.01510859  0.00336139 ...  0.07971715  0.09885529  -0.01855042]
# [-0.04523939 -0.00046248  0.02036596 ...  0.08779042  0.04936493  -0.06218244]
# [-0.05453169  0.01125113 -0.00680178 ...  0.06443197  0.08771271  -0.00063468]]
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There are many &lt;a href="https://huggingface.co/models?pipeline_tag=feature-extraction&amp;amp;library=sentence-transformers&amp;amp;sort=trending" rel="noopener noreferrer"&gt;more models you can use to generate vector embeddings with the sentence-transformers library&lt;/a&gt; and, because you’re running locally, you can try them out to see which is most appropriate for your data. You do need to watch out for any restrictions that these models might have. For example, the all-MiniLM-L6-v2 model doesn’t produce good results for more than 128 tokens and can only handle a maximum of 256 tokens. BGE-M3, on the other hand, can encode up to 8,192 tokens. However, the BGE-M3 model is a couple of gigabytes in size and all-MiniLM-L6-v2 is under 100MB, so there are space and memory constraints to consider, too.&lt;/p&gt;

&lt;p&gt;Local embedding models like this are useful when you’re experimenting on your laptop, or if you have &lt;a href="https://www.sbert.net/docs/installation.html#install-pytorch-with-cuda-support" rel="noopener noreferrer"&gt;hardware that PyTorch can use to speed up the encoding process&lt;/a&gt;. It’s a good way to get comfortable running different models and seeing how they interact with your data.&lt;/p&gt;

&lt;p&gt;If you don't want to run your models locally, there are plenty of available APIs you can use to create embeddings for your documents.&lt;/p&gt;

&lt;h2&gt;
  
  
  APIs
&lt;/h2&gt;

&lt;p&gt;There are several services that make embedding models available as APIs. These include LLM providers like &lt;a href="https://platform.openai.com/docs/guides/embeddings" rel="noopener noreferrer"&gt;OpenAI&lt;/a&gt;, &lt;a href="https://ai.google.dev/gemini-api/docs/embeddings" rel="noopener noreferrer"&gt;Google&lt;/a&gt;, or &lt;a href="https://docs.cohere.com/docs/cohere-embed" rel="noopener noreferrer"&gt;Cohere&lt;/a&gt;, as well as specialist providers like &lt;a href="https://jina.ai/embeddings/" rel="noopener noreferrer"&gt;Jina AI&lt;/a&gt; or model hosts like &lt;a href="https://docs.fireworks.ai/guides/querying-embeddings-models" rel="noopener noreferrer"&gt;Fireworks&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;These API providers provide HTTP APIs, often with a Python package to make it easy to call them. You will typically require an API key from the service. Once you have that setup you can generate vector embeddings by sending your text to the API.&lt;/p&gt;

&lt;p&gt;For example, with Google's &lt;a href="https://ai.google.dev/gemini-api/docs/sdks" rel="noopener noreferrer"&gt;google-genai SDK&lt;/a&gt; and a Gemini API key you can generate a vector embedding with their &lt;a href="https://developers.googleblog.com/en/gemini-embedding-text-model-now-available-gemini-api/" rel="noopener noreferrer"&gt;experimental Gemini embedding model&lt;/a&gt; like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;genai&lt;/span&gt;


&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;genai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GEMINI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;embed_content&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-embedding-exp-03-07&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;contents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A robot may not injure a human being or, through inaction, allow a human being to come to harm.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each API can be different, though many providers do make OpenAI-compatible APIs. However, each time you try a new provider you might find you have a new API to learn. Unless, of course, you try one of the available frameworks that are intended to simplify this.&lt;/p&gt;

&lt;h2&gt;
  
  
  Frameworks
&lt;/h2&gt;

&lt;p&gt;There are several projects available, like &lt;a href="https://www.langchain.com/" rel="noopener noreferrer"&gt;LangChain&lt;/a&gt; or &lt;a href="https://docs.llamaindex.ai/en/stable/" rel="noopener noreferrer"&gt;LlamaIndex&lt;/a&gt;, that create abstractions over the common components of the GenAI ecosystem, including embeddings.&lt;/p&gt;

&lt;p&gt;Both LangChain and LlamaIndex have methods for creating vector embeddings via APIs or local models, all with the same interface. For example, you can create the same Gemini embedding as the code snippet above with LangChain like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_google_genai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;GoogleGenerativeAIEmbeddings&lt;/span&gt;


&lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;GoogleGenerativeAIEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-embedding-exp-03-07&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;google_api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GEMINI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;embed_query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A robot may not injure a human being or, through inaction, allow a human being to come to harm.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As a comparison, here is how you would generate an embedding using an OpenAI embeddings model and LangChain:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAIEmbeddings&lt;/span&gt;


&lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAIEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text-embedding-3-small&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;embed_query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A robot may not injure a human being or, through inaction, allow a human being to come to harm.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We had to change the name of the import and the API key we used, but otherwise the code is identical. This makes it easy to swap them out and experiment.&lt;/p&gt;

&lt;p&gt;If you're using LangChain to build your entire RAG pipeline, these embeddings fit in well with the vector database interfaces. You can provide an embedding model to the database object and LangChain handles generating the embeddings as you insert documents or perform queries. For example, here's how you can combine the Google embeddings model with the &lt;a href="https://python.langchain.com/docs/integrations/vectorstores/astradb/" rel="noopener noreferrer"&gt;LangChain wrapper for Astra DB&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_google_genai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;GoogleGenerativeAIEmbeddings&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_astradb&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AstraDBVectorStore&lt;/span&gt;


&lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;GoogleGenerativeAIEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-embedding-exp-03-07&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;google_api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GEMINI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;vector_store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AstraDBVectorStore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
   &lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;astra_vector_langchain&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;api_endpoint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_API_ENDPOINT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;vector_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# a list of document objects to store in the db
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can use the same &lt;code&gt;vector_store&lt;/code&gt; object and associated embeddings to perform the vector search, too.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vector_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;similarity_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Are robots allowed to protect themselves?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;LlamaIndex has a similar set of abstractions that enable you to combine different embedding models and vector stores. Check out this &lt;a href="https://docs.llamaindex.ai/en/stable/understanding/rag/" rel="noopener noreferrer"&gt;LlamaIndex introduction to RAG&lt;/a&gt; to learn more.&lt;/p&gt;

&lt;p&gt;If you're new to embeddings, &lt;a href="https://python.langchain.com/docs/integrations/text_embedding/" rel="noopener noreferrer"&gt;LangChain&lt;/a&gt; has a handy list of embedding models and providers that can help you find different options to try.&lt;/p&gt;

&lt;h2&gt;
  
  
  Directly in the database
&lt;/h2&gt;

&lt;p&gt;The methods we’ve talked through so far have involved creating a vector independently of storing it in or using it to search against a vector database. When you want to store those vectors in a &lt;a href="https://www.datastax.com/guides/what-is-a-vector-database?utm_medium=byline&amp;amp;utm_campaign=how-to-create-vector-embeddings-in-python&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;vector database like Astra DB&lt;/a&gt;, it looks a bit like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;astrapy&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DataAPIClient&lt;/span&gt;


&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DataAPIClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;database&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_database&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_API_ENDPOINT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;database&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;COLLECTION_NAME&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insert_one&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
         &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A robot may not injure a human being or, through inaction, allow a human being to come to harm.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
         &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$vector&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.04574034&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.038084425&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;0.00916391&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The above assumes that you have already created your vector-enabled collection with the right number of dimensions for the model you’re using.&lt;/p&gt;

&lt;p&gt;Performing a vector search then looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;cursor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;{},&lt;/span&gt;
    &lt;span class="n"&gt;sort&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$vector&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.04574034&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.038084425&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;0.00916391&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...]}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;document&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In these examples, you have to create your vectors first, before storing or searching against the database with them. In the case of the frameworks, you might not see this happen, as it has been abstracted away, but the operations are being performed.&lt;/p&gt;

&lt;p&gt;With Astra DB, you can have the database generate the vector embeddings for you as you either insert the document into the collection or at the point of performing the search. This is called &lt;a href="https://docs.datastax.com/en/astra-db-serverless/databases/embedding-generation.html?utm_medium=byline&amp;amp;utm_campaign=how-to-create-vector-embeddings-in-python&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Astra Vectorize&lt;/a&gt; and it simplifies a crucial step in your RAG pipeline.&lt;/p&gt;

&lt;p&gt;To use Vectorize, you first need to set up an embedding provider integration. There’s one built-in integration that you can use with no extra work; the &lt;a href="https://www.datastax.com/integrations/vectorize-with-nvidia?utm_medium=byline&amp;amp;utm_campaign=how-to-create-vector-embeddings-in-python&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;NVIDIA NV-Embed-QA model&lt;/a&gt;, or you can choose one of the other embeddings providers and configure them with your API.&lt;/p&gt;

&lt;p&gt;When you create a collection, you can choose which embedding provider you want to use with the requisite number of dimensions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff8e1poo6z0jw73ilw8wn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff8e1poo6z0jw73ilw8wn.png" alt="A screen shot of the form to create a collection. After the name field, there is a drop down list where you can select the embedding provider you want to use, in this example, NVIDIA has been chosen. Then there are fields for the number of dimensions and the similarity metric to use." width="800" height="589"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When you set up your collection this way you can add content and have it automatically vectorized by using the special property &lt;code&gt;$vectorize&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insert_one&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
         &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$vectorize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A robot may not injure a human being or, through inaction, allow a human being to come to harm.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, when a user query comes in, you can perform a vector search by sorting using the &lt;code&gt;$vectorize&lt;/code&gt; property. Astra DB will create the vector embedding and then make the search in one step.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;cursor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;{},&lt;/span&gt;
    &lt;span class="n"&gt;sort&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$vectorize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Are robots allowed to protect themselves?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There are several advantages to this approach:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The Astra DB team has done the work to make the embedding creation robust already&lt;/li&gt;
&lt;li&gt;Making two separate API calls to create embeddings and then store them is often slower than letting Astra DB handle it&lt;/li&gt;
&lt;li&gt;Using the built-in NVIDIA embeddings model is even quicker than that&lt;/li&gt;
&lt;li&gt;You have less code to write and maintain&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  A world of vector embedding options
&lt;/h2&gt;

&lt;p&gt;As we have seen, there are many choices you can make in how to implement vector embeddings, which model you use, and which provider you use. It's an important step in your RAG pipeline and it is important to spend the time to find out which model and method is right for your application and your data.&lt;/p&gt;

&lt;p&gt;You can choose to host your own models, rely on third-party APIs, abstract the problem away through frameworks, or entrust Astra DB to create embeddings for you. Of course, if you want to avoid code entirely, then you can &lt;a href="https://www.datastax.com/products/langflow?utm_medium=byline&amp;amp;utm_campaign=how-to-create-vector-embeddings-in-python&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;drag-and-drop your components into place with Langflow&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you want to chat more about vector embeddings and RAG, drop into the &lt;a href="https://discord.gg/datastax" rel="noopener noreferrer"&gt;DataStax Devs Discord&lt;/a&gt; or drop me an email at &lt;a href="mailto:phil.nash@datastax.com"&gt;phil.nash@datastax.com&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Frequently asked questions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What are vector embeddings?
&lt;/h3&gt;

&lt;p&gt;Vector embeddings are numerical representations of text in multi-dimensional space used for tasks like document retrieval and recommendation systems.&lt;/p&gt;

&lt;h3&gt;
  
  
  What steps are involved in creating vector embeddings for a retrieval-augmented generation (RAG) app?
&lt;/h3&gt;

&lt;p&gt;To create vector embeddings, you need to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Collect unstructured data&lt;/li&gt;
&lt;li&gt;Split data into chunks&lt;/li&gt;
&lt;li&gt;Turn chunks into vector embeddings&lt;/li&gt;
&lt;li&gt;Store embeddings in a vector database&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  How can I create vector embeddings locally in Python?
&lt;/h3&gt;

&lt;p&gt;You can create vector embeddings locally in Python using pre-trained embedding models from the HuggingFace, specifically using the sentence-transformers library.&lt;/p&gt;

&lt;h3&gt;
  
  
  What are some limitations of local embedding models?
&lt;/h3&gt;

&lt;p&gt;Local embedding models handle a limited number of tokens effectively, and larger models require substantial memory and storage.&lt;/p&gt;

&lt;h3&gt;
  
  
  How can I create vector embeddings using an API?
&lt;/h3&gt;

&lt;p&gt;You can create vector embeddings using APIs provided by services such as OpenAI, Google, and Cohere.&lt;/p&gt;

&lt;h3&gt;
  
  
  Are there frameworks to simplify embedding creation?
&lt;/h3&gt;

&lt;p&gt;Yes, frameworks like LangChain and LlamaIndex offer standardized interfaces that abstract the complexities of embedding models and APIs.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is Astra Vectorize, and how does it simplify the embedding process?
&lt;/h3&gt;

&lt;p&gt;Astra Vectorize enables Astra DB to automatically generate vector embeddings as documents are inserted or queries are performed.&lt;/p&gt;

&lt;h3&gt;
  
  
  What are the advantages of using Astra Vectorize?
&lt;/h3&gt;

&lt;p&gt;The advantages include simplified code maintenance, faster performance, improved efficiency, and robustness through pre-tested integrations.&lt;/p&gt;

</description>
      <category>python</category>
      <category>genai</category>
      <category>vectordatabase</category>
    </item>
    <item>
      <title>How to Create Vector Embeddings in Node.js</title>
      <dc:creator>Phil Nash</dc:creator>
      <pubDate>Thu, 03 Apr 2025 21:43:17 +0000</pubDate>
      <link>https://forem.com/datastax/how-to-create-vector-embeddings-in-nodejs-2khl</link>
      <guid>https://forem.com/datastax/how-to-create-vector-embeddings-in-nodejs-2khl</guid>
      <description>&lt;p&gt;When you’re building a &lt;a href="https://www.ibm.com/think/topics/retrieval-augmented-generation" rel="noopener noreferrer"&gt;retrieval-augmented generation (RAG)&lt;/a&gt; app, job number one is preparing your data. You’ll need to take your unstructured data and &lt;a href="https://philna.sh/blog/2024/09/18/how-to-chunk-text-in-javascript-for-rag-applications/" rel="noopener noreferrer"&gt;split it up into chunks&lt;/a&gt;, turn those chunks into &lt;a href="https://www.ibm.com/think/topics/vector-embedding" rel="noopener noreferrer"&gt;vector embeddings&lt;/a&gt;, and finally, store the embeddings in a &lt;a href="https://www.ibm.com/think/topics/vector-database" rel="noopener noreferrer"&gt;vector database&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;There are many ways that you can create vector embeddings in JavaScript. In this post, we’ll investigate four ways to generate vector embeddings in Node.js: locally, via API, via a framework, and with &lt;a href="https://docs.datastax.com/en/astra-db-serverless/databases/embedding-generation.html?utm_medium=byline&amp;amp;utm_campaign=how-to-create-vector-embeddings-in-node-js&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Astra DB's Vectorize&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Local vector embeddings
&lt;/h2&gt;

&lt;p&gt;There are lots of open-source models available on &lt;a href="https://huggingface.co/" rel="noopener noreferrer"&gt;HuggingFace&lt;/a&gt; that can be used to create vector embeddings. &lt;a href="https://huggingface.co/docs/transformers.js/en/index" rel="noopener noreferrer"&gt;Transformers.js&lt;/a&gt; is a module that lets you use machine learning models in JavaScript, both in the browser and Node.js. It uses the &lt;a href="https://onnxruntime.ai/" rel="noopener noreferrer"&gt;ONNX runtime&lt;/a&gt; to achieve this; it works with models that have published ONNX weights, of which there are plenty. Some of those models we can use to create vector embeddings.&lt;/p&gt;

&lt;p&gt;You can install the module with&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @xenova/transformers
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://huggingface.co/docs/transformers.js/index#tasks" rel="noopener noreferrer"&gt;The package can actually perform many tasks&lt;/a&gt;, but &lt;a href="https://huggingface.co/docs/transformers.js/api/pipelines#module_pipelines.FeatureExtractionPipeline" rel="noopener noreferrer"&gt;feature extraction&lt;/a&gt; is what you want for generating vector embeddings.&lt;/p&gt;

&lt;p&gt;A popular, local model for vector embedding is &lt;a href="https://huggingface.co/Xenova/all-MiniLM-L6-v2" rel="noopener noreferrer"&gt;all-MiniLM-L6-v2&lt;/a&gt;. It’s trained as a good all-rounder and produces a 384-dimension vector from a chunk of text.&lt;/p&gt;

&lt;p&gt;To use it, import the &lt;code&gt;pipeline&lt;/code&gt; function from Transformers.js and create an extractor that will perform "feature-extraction" using your provided model. You can then pass a chunk of text to the extractor and it will return a tensor object which you can turn into a plain JavaScript array of numbers.&lt;/p&gt;

&lt;p&gt;All in all, it looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;pipeline&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@xenova/transformers&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;extractor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;pipeline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;feature-extraction&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Xenova/all-MiniLM-L6-v2&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;extractor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;A robot may not injure a human being or, through inaction, allow a human being to come to harm.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;pooling&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;mean&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;normalize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="c1"&gt;// =&amp;gt; [-0.004044221248477697,  0.026746056973934174,   0.0071970801800489426, ... ]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can actually embed multiple texts at a time if you pass an array to the extractor. Then you can call tolist on the response and that will return you a list of arrays as your vectors.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;extractor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;A robot may not injure a human being or, through inaction, allow a human being to come to harm.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;pooling&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;mean&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;normalize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tolist&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;span class="c1"&gt;// [&lt;/span&gt;
&lt;span class="c1"&gt;//   [ -0.006129210349172354,  0.016346964985132217,   0.009711502119898796, ...],&lt;/span&gt;
&lt;span class="c1"&gt;//   [-0.053930871188640594,  -0.002175076398998499,   0.032391052693128586, ...],&lt;/span&gt;
&lt;span class="c1"&gt;//   [-0.05358131229877472,  0.021030642092227936, 0.0010665050940588117, ...]&lt;/span&gt;
&lt;span class="c1"&gt;// ]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There are &lt;a href="https://huggingface.co/models?pipeline_tag=feature-extraction&amp;amp;library=transformers.js" rel="noopener noreferrer"&gt;many models you can use to create vector embeddings from text&lt;/a&gt;, and, because you’re running locally, you can try them out to see which works best for your data. You should pay attention to the length of text that these models can handle. For example, the all-MiniLM-L6-v2 model does not provide good results for more than 128 &lt;a href="https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them" rel="noopener noreferrer"&gt;tokens&lt;/a&gt; and can handle a maximum of 256 tokens, so it’s useful for sentences or small paragraphs. If you have a bigger source of text data than that, you’ll need to &lt;a href="https://philna.sh/blog/2024/09/18/how-to-chunk-text-in-javascript-for-rag-applications/" rel="noopener noreferrer"&gt;split your data into appropriately sized chunks&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Local embedding models like this are useful if you’re experimenting on your own machine, or have the &lt;a href="https://onnxruntime.ai/docs/get-started/with-javascript/node.html" rel="noopener noreferrer"&gt;right hardware to run them efficiently when deployed&lt;/a&gt;. It's an easy way to get comfortable with different models and get a feel for how things work without having to sign up to a bunch of different API services.&lt;/p&gt;

&lt;p&gt;Having said that, there are a lot of useful vector embedding models available as an API, so let's take a look at them next.&lt;/p&gt;

&lt;h2&gt;
  
  
  APIs
&lt;/h2&gt;

&lt;p&gt;There is an abundance of services that provide embedding models as APIs. These include LLM providers, like &lt;a href="https://platform.openai.com/docs/guides/embeddings" rel="noopener noreferrer"&gt;OpenAI&lt;/a&gt;, &lt;a href="https://ai.google.dev/gemini-api/docs/embeddings" rel="noopener noreferrer"&gt;Google&lt;/a&gt; or &lt;a href="https://docs.cohere.com/docs/cohere-embed" rel="noopener noreferrer"&gt;Cohere&lt;/a&gt;, as well as specialist providers like &lt;a href="https://www.voyageai.com/" rel="noopener noreferrer"&gt;Voyage AI&lt;/a&gt; or &lt;a href="https://jina.ai/embeddings/" rel="noopener noreferrer"&gt;Jina&lt;/a&gt;. Most providers have general purpose embedding models, but some provide models trained for specific datasets, like &lt;a href="https://docs.voyageai.com/docs/embeddings" rel="noopener noreferrer"&gt;Voyage AI's finance, law and code&lt;/a&gt; optimised models.&lt;/p&gt;

&lt;p&gt;These API providers provide HTTP APIs, often with an npm package to make it easy to call them. You’ll typically need an API key from the service and you can then generate embeddings by sending your text to the API.&lt;/p&gt;

&lt;p&gt;For example, you can &lt;a href="https://ai.google.dev/gemini-api/docs/embeddings" rel="noopener noreferrer"&gt;use Google's text embedding models through the Gemini API&lt;/a&gt; like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;GoogleGenerativeAI&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@google/generative-ai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;genAI&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;GoogleGenerativeAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;API_KEY&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;genAI&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getGenerativeModel&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text-embedding-004&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;A robot may not injure a human being or, through inaction, allow a human being to come to harm.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;embedContent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;values&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// =&amp;gt; [0.04574034, 0.038084425, -0.00916391, ...]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each API is different though, so while making a request to create embeddings is normally fairly straightforward, you’ll likely have to learn a new method for each API you want to call—unless of course, you try one of the available frameworks that are intended to simplify this.&lt;/p&gt;

&lt;h2&gt;
  
  
  Frameworks
&lt;/h2&gt;

&lt;p&gt;There are many projects out there, like &lt;a href="https://js.langchain.com/docs/introduction/" rel="noopener noreferrer"&gt;LangChain&lt;/a&gt; or &lt;a href="https://ts.llamaindex.ai/" rel="noopener noreferrer"&gt;LlamaIndex&lt;/a&gt;, that create abstractions over the various parts of the GenAI toolchain, including embeddings.&lt;/p&gt;

&lt;p&gt;Both LangChain and LlamaIndex enable you to generate embeddings via APIs or local models, all with the same interface. For example, here’s how you can &lt;a href="https://js.langchain.com/docs/integrations/text_embedding/google_generativeai/" rel="noopener noreferrer"&gt;create the same embedding as above using the Gemini API and LangChain&lt;/a&gt; together:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;GoogleGenerativeAIEmbeddings&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@langchain/google-genai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;GoogleGenerativeAIEmbeddings&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text-embedding-004&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;A robot may not injure a human being or, through inaction, allow a human being to come to harm.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;embedQuery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// =&amp;gt; [0.04574034, 0.038084425, -0.00916391, ...]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To compare, this is what it looks like to use the &lt;a href="https://js.langchain.com/docs/integrations/text_embedding/openai/" rel="noopener noreferrer"&gt;OpenAI embeddings model through LangChain&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;OpenAIEmbeddings&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@langchain/openai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenAIEmbeddings&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text-embedding-3-large&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;A robot may not injure a human being or, through inaction, allow a human being to come to harm.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;embedQuery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// =&amp;gt; [0.009445431, -0.0073068426, -0.00814802, ...]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Aside from changing the name of the import and sometimes the options, the embedding models all have a consistent interface to make it easier to swap them out.&lt;/p&gt;

&lt;p&gt;If you’re using LangChain to create your entire pipeline, these embedding interfaces work very well alongside the vector database interfaces. You can provide an embedding model to the database integration and LangChain handles generating the embeddings as you insert documents or perform vector searches. For example, here is how to embed some documents using Google's embeddings and store them in &lt;a href="https://js.langchain.com/docs/integrations/vectorstores/astradb/" rel="noopener noreferrer"&gt;Astra DB via LangChain&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;GoogleGenerativeAIEmbeddings&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@langchain/google-genai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;AstraDBVectorStore&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@langchain/community/vectorstores/astradb&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;GoogleGenerativeAIEmbeddings&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text-embedding-004&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;vectorStore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;AstraDBVectorStore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fromDocuments&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="nx"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// a list of document objects to put in the store&lt;/span&gt;
  &lt;span class="nx"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// the embeddings model&lt;/span&gt;
  &lt;span class="nx"&gt;astraConfig&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// config to connect to Astra DB&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you provide the embeddings model to the database object, you can then use it to perform vector searches too.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;vectorStore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;similaritySearch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Are robots allowed to protect themselves?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;LlamaIndex allows for similar creation of embedding models and vector stores that use them. Check out &lt;a href="https://ts.llamaindex.ai/docs/llamaindex/tutorials/rag" rel="noopener noreferrer"&gt;the LlamaIndex documentation on RAG&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;As a bonus, the lists of models that &lt;a href="https://js.langchain.com/docs/integrations/text_embedding/" rel="noopener noreferrer"&gt;LangChain&lt;/a&gt; and &lt;a href="https://ts.llamaindex.ai/docs/llamaindex/modules/embeddings" rel="noopener noreferrer"&gt;LlamaIndex&lt;/a&gt; integrate are good examples of popular embedding models.&lt;/p&gt;

&lt;h2&gt;
  
  
  Directly in the database
&lt;/h2&gt;

&lt;p&gt;So far, the methods above mostly involve creating a vector embedding independently of storing the embedding in a vector database. When you want to store those vectors in a vector database like Astra DB, it looks a bit like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;DataAPIClient&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@datastax/astra-db-ts&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;DataAPIClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;db&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ASTRA_DB_API_ENDPOINT&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ASTRA_DB_COLLECTION&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insertOne&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;A robot may not injure a human being or, through inaction, allow a human being to come to harm.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;$vector&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.04574034&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.038084425&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;0.00916391&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...]&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This assumes you have already created a vector enabled collection with the correct number of dimensions for the model you are using.&lt;/p&gt;

&lt;p&gt;You can also search against the documents in your collection using a vector like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cursor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;({},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$vector&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.04574034&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.038084425&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;0.00916391&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...]&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toArray&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this case, you have to create your vectors first, and then store or search against the database with them. Even in the case of the frameworks, that process happens, but it’s just abstracted away.&lt;/p&gt;

&lt;p&gt;With Astra DB, you can have the database generate the embeddings for you as you’re inserting documents into a collection or as you perform a vector search against a collection.&lt;/p&gt;

&lt;p&gt;This is called &lt;a href="https://docs.datastax.com/en/astra-db-serverless/databases/embedding-generation.html?utm_medium=byline&amp;amp;utm_campaign=how-to-create-vector-embeddings-in-node-js&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Astra DB vectorize&lt;/a&gt;; here's how it works.&lt;/p&gt;

&lt;p&gt;First, set up an embedding provider integration. There is a built-in integration offering the &lt;a href="https://build.nvidia.com/nvidia/embed-qa-4" rel="noopener noreferrer"&gt;NVIDIA NV-Embed-QA model&lt;/a&gt;, or you can choose one of the other providers and configure them with your own API key.&lt;/p&gt;

&lt;p&gt;Then when you set up a collection, you can choose which embedding provider you want to use and set the correct number of dimensions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffpb60b3qbxf9b2tbbp34.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffpb60b3qbxf9b2tbbp34.png" width="800" height="547"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now, when you add a document to this collection, you can add the content using the special key &lt;code&gt;$vectorize&lt;/code&gt; and a vector embedding will be created.&lt;/p&gt;

&lt;p&gt;await collection.insertOne({&lt;br&gt;
  $vectorize: "A robot may not injure a human being or, through inaction, allow a human being to come to harm."&lt;br&gt;
});&lt;/p&gt;

&lt;p&gt;When you want to perform a vector search against this collection, you can sort by the special &lt;code&gt;$vectorize&lt;/code&gt; field and again, Astra DB will handle creating vector embeddings and then performing the search.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cursor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;({},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$vectorize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Are robots allowed to protect themselve?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toArray&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This has several advantages:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It's robust, as Astra DB handles the interaction with the embedding provider&lt;/li&gt;
&lt;li&gt;It can be quicker than making two separate API calls to create embeddings and then store them&lt;/li&gt;
&lt;li&gt;It's less code for you to write&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Choose the method that works best for your application
&lt;/h2&gt;

&lt;p&gt;There are many models, providers, and methods you can use to turn text into vector embeddings. Creating vector embeddings from your content is a vital part of the RAG pipeline and it does require some experimentation to get it right for your data.&lt;/p&gt;

&lt;p&gt;You have the choice to host your own models, call on APIs, use a framework, or let Astra DB handle creating vector embeddings for you. And, if you want to avoid code altogether, you could choose to use &lt;a href="https://docs.langflow.org/chat-with-rag" rel="noopener noreferrer"&gt;Langflow's drag-and-drop interface to create your RAG pipeline&lt;/a&gt;&lt;/p&gt;

</description>
      <category>node</category>
      <category>genai</category>
      <category>vectordatabase</category>
      <category>javascript</category>
    </item>
    <item>
      <title>5 GenAI Things You Didn't Know About Astra DB</title>
      <dc:creator>Phil Nash</dc:creator>
      <pubDate>Thu, 06 Mar 2025 23:07:23 +0000</pubDate>
      <link>https://forem.com/datastax/5-genai-things-you-didnt-know-about-astra-db-3am9</link>
      <guid>https://forem.com/datastax/5-genai-things-you-didnt-know-about-astra-db-3am9</guid>
      <description>&lt;p&gt;Astra DB is a high-performance NoSQL database powered by Apache Cassandra® with built-in vector search, but that's just what &lt;a href="https://www.datastax.com/products/datastax-astra?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;the product page&lt;/a&gt; says. Not everything fits onto one page, so I wanted to share a few things that you might not already know about Astra DB and how it helps you to build accurate, low-latency, retrieval-augmented generation (RAG) powered generative AI apps.&lt;/p&gt;

&lt;h2&gt;
  
  
  Astra DB can create vector embeddings for you
&lt;/h2&gt;

&lt;p&gt;When ingesting data for a RAG application, there are several steps you need to take: document loading, text parsing, &lt;a href="https://www.datastax.com/blog/how-to-chunk-text-in-javascript-for-rag-applications?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;chunking text&lt;/a&gt;, &lt;a href="https://www.datastax.com/blog/how-to-create-vector-embeddings-in-node-js?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;creating vector embeddings&lt;/a&gt;, and storing it in the database. Astra DB can simplify the process by combining those last two steps.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.datastax.com/blog/simplifying-vector-embedding-generation-with-astra-vectorize?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Astra Vectorize&lt;/a&gt; can create vector embeddings for your text chunks at the point of inserting them into the collection.&lt;/p&gt;

&lt;p&gt;When you create an Astra DB collection, you can choose one of the &lt;a href="https://docs.datastax.com/en/astra-db-serverless/databases/embedding-generation.html#supported-embedding-providers?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;supported embedding models&lt;/a&gt;. There are models available from OpenAI (including Azure OpenAI), Voyage AI, Mistral AI, Jina AI, and Upstage. Astra DB also hosts NVIDIA embedding models that run in the same environment as the database, boosting performance—&lt;a href="https://www.datastax.com/blog/build-generative-ai-wikidata-datastax-nvidia#ingestion-workflow-3?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Wikidata reduced their data ingestion time from 30 days to two with Vectorize&lt;/a&gt;—and ensuring the data never leaves the database.&lt;/p&gt;

&lt;p&gt;Once you have set up your collection with your embedding provider of choice, ingesting data with Vectorize is a case of providing the text you want turned into a vector as a special &lt;code&gt;$vectorize&lt;/code&gt; property in the documents you are storing. In TypeScript, this looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;DataAPIClient&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@datastax/astra-db-ts&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;DataAPIClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;db&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ASTRA_DB_API_ENDPOINT&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ASTRA_DB_COLLECTION&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insertOne&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;$vectorize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;A robot may not injure a human being or, through inaction, allow a human being to come to harm.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then to perform a vector search against this collection you use the &lt;code&gt;$vectorize&lt;/code&gt; field to sort by your query.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cursor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;({},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$vectorize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Are robots allowed to protect themselves?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toArray&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can learn more about &lt;a href="https://docs.datastax.com/en/astra-db-serverless/databases/embedding-generation.html?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Astra Vectorize in the documentation&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Astra DB supports graph RAG
&lt;/h2&gt;

&lt;p&gt;Depending on your data, regular vector search can sometimes miss context, which makes it harder for &lt;a href="https://www.datastax.com/guides/what-is-a-large-language-model?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;large language models (LLMs)&lt;/a&gt; to answer certain queries. &lt;a href="https://www.datastax.com/guides/graph-rag?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Graph RAG&lt;/a&gt; is a technique that takes your documents, extracts links between them, and uses those links to retrieve extra contextual information at the retrieval stage. Providing extra linked context to an LLM makes for &lt;a href="https://www.datastax.com/blog/better-llm-integration-and-relevancy-with-content-centric-knowledge-graphs?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;more accurate and informed answers&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Astra DB supports graph RAG via LangChain. You can replace the &lt;code&gt;AstraDBVectorStore&lt;/code&gt; with &lt;a href="https://python.langchain.com/api_reference/astradb/graph_vectorstores/langchain_astradb.graph_vectorstores.AstraDBGraphVectorStore.html" rel="noopener noreferrer"&gt;&lt;code&gt;AstraDBGraphVectorStore&lt;/code&gt;&lt;/a&gt; and ensure you ingest your data in a way that extracts the links between documents. A simplified ingestion example that reads a URL, extracts HTML links, strips the HTML, and splits the text into chunks before storing in Astra DB (using Astra Vectorize to create embeddings) might look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_text_splitters&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RecursiveCharacterTextSplitter&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.document_loaders&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AsyncHtmlLoader&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.graph_vectorstores.extractors&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;HtmlLinkExtractor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;LinkExtractorTransformer&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.document_transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BeautifulSoupTransformer&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_astradb&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AstraDBGraphVectorStore&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;CollectionVectorServiceOptions&lt;/span&gt;

&lt;span class="n"&gt;vectorize_options&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;CollectionVectorServiceOptions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;nvidia&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;NV-Embed-QA&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;vector_store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AstraDBGraphVectorStore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;graph&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;api_endpoint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_API_ENDPOINT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;collection_vector_service_options&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;vectorize_options&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;urls&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://www.datastax.com/guides/graph-rag&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://www.datastax.com/blog/build-graph-rag-with-unstructured-and-astra-db&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;loader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AsyncHtmlLoader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;urls&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;loader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;transformer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LinkExtractorTransformer&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nc"&gt;HtmlLinkExtractor&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;as_document_extractor&lt;/span&gt;&lt;span class="p"&gt;()])&lt;/span&gt;
&lt;span class="n"&gt;bs4_transformer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;BeautifulSoupTransformer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;text_splitter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;RecursiveCharacterTextSplitter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk_overlap&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;transformer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;transform_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bs4_transformer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;transform_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;text_splitter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;vector_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then to search Astra DB, you can use the graph store's &lt;code&gt;traversal_search&lt;/code&gt; method to first retrieve a number of document chunks (&lt;em&gt;k&lt;/em&gt;), before traversing the graph to the specified depth for additional chunks. In this case, we perform the search initially finding four chunks using a similarity search and then traversing the graph to a depth of two to return related chunks.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;traversal_results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vector_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;traversal_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What are the differences between Graph RAG and naive RAG?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;depth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check out this &lt;a href="https://www.datastax.com/blog/build-graph-rag-with-unstructured-and-astra-db?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;full tutorial on building graph RAG with Unstructured and Astra DB&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Astra DB supports ColBERT
&lt;/h2&gt;

&lt;p&gt;Graph RAG can help if your context is spread across chunks, but there are other situations where graph RAG won't necessarily help. If your data contains terms that aren't in the training data of your embedding model, it can be difficult to get accurate similarity search results.&lt;/p&gt;

&lt;p&gt;One way to overcome this is to use &lt;a href="https://thenewstack.io/overcoming-the-limits-of-rag-with-colbert/" rel="noopener noreferrer"&gt;ColBERT&lt;/a&gt;. ColBERT creates a vector per token in a body of text, creating a sliding window of context over entire passages and capturing unknown context much better. This does require more storage for the extra vectors, but if accuracy is your priority, it’s worthwhile.&lt;/p&gt;

&lt;p&gt;You can use ColBERT with Astra DB in LangChain by using the RAGStack implementation.&lt;/p&gt;

&lt;p&gt;To ingest the data, you can use the &lt;code&gt;ColbertEmbeddingModel&lt;/code&gt; and &lt;code&gt;ColbertVectorStore&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;ragstack_colbert&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;CassandraDatabase&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ColbertEmbeddingModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ColbertVectorStore&lt;/span&gt;

&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ColbertEmbeddingModel&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;database&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;CassandraDatabase&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_astra&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;astra_token&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="n"&gt;database_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_DATABASE_ID&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="n"&gt;keyspace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;default_keyspace&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;vector_store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ColbertVectorStore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;database&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;database&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;embedding_model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vector_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_texts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;texts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;YOUR_LIST_OF_TEXTS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;doc_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;myDocs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then performing a similarity search is pretty much the same as any other vector store search in LangChain.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;ragstack_colbert&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;CassandraDatabase&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ColbertEmbeddingModel&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;ragstack_langchain.colbert&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ColbertVectorStore&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;LangchainColbertVectorStore&lt;/span&gt;

&lt;span class="n"&gt;colbert_embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ColbertEmbeddingModel&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;colbert_database&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;CassandraDatabase&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_astra&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;astra_token&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;YOUR_ASTRA_DB_TOKEN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;database_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;YOUR_ASTRA_DB_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;keyspace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;default_keyspace&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;vector_store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LangchainColbertVectorStore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;database&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;colbert_database&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;embedding_model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;colbert_embedding&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is ColBERT?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vector_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;similarity_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check out this full tutorial on &lt;a href="https://www.datastax.com/blog/highly-accurate-retrieval-for-your-rag-application-with-colbert-and-astra-db?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;using ColBERT with Astra DB&lt;/a&gt;, or for a faster alternative, &lt;a href="https://www.datastax.com/blog/colbert-live-makes-your-vector-database-smarter?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Jonathan Ellis's ColBERT Live!, which uses Answer AI's colbert-small-v1 model and is supported by Astra DB&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Astra DB indexes your vectors live
&lt;/h2&gt;

&lt;p&gt;Your vector database needs to be both accurate and speedy in order to ensure the performance of your application. When you are ingesting or updating data in your collection, rebuilding the index takes time and leaves you with slow queries or out of date data.&lt;/p&gt;

&lt;p&gt;Astra DB's vector indexing capabilities are a combination of &lt;a href="https://docs.datastax.com/en/cql/hcd/develop/indexing/sai/sai-overview.html?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Cassandra's storage-attached indexing (SAI)&lt;/a&gt; and &lt;a href="https://github.com/jbellis/jvector" rel="noopener noreferrer"&gt;JVector, a non-blocking, concurrent, graph-based vector index&lt;/a&gt;. What this means is that Astra DB doesn't need to rebuild or block access to its index when you are inserting vectors, they are updated live.&lt;/p&gt;

&lt;p&gt;The upshot of this is high throughput and accuracy even under mixed loads of reads and writes. Check out &lt;a href="https://www.datastax.com/blog/5-vector-search-challenges-and-how-we-solved-them-in-apache-cassandra#problem-3-concurrency-4?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;this benchmark of throughput and accuracy against Pinecone&lt;/a&gt;, particularly when Pinecone is performing indexing. Astra DB doesn't sacrifice throughput or accuracy under load; it will always be there for your application.&lt;/p&gt;

&lt;h2&gt;
  
  
  Astra DB is integrated in all your favourite frameworks
&lt;/h2&gt;

&lt;p&gt;We've seen so far in this post that &lt;a href="https://python.langchain.com/docs/integrations/vectorstores/astradb/" rel="noopener noreferrer"&gt;Astra DB is available in LangChain&lt;/a&gt;, but you can also find it in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://js.langchain.com/docs/integrations/vectorstores/astradb/" rel="noopener noreferrer"&gt;LangChain.JS&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.llamaindex.ai/en/stable/examples/vector_stores/AstraDBIndexDemo/" rel="noopener noreferrer"&gt;LlamaIndex&lt;/a&gt; and &lt;a href="https://ts.llamaindex.ai/docs/api/classes/AstraDBVectorStore" rel="noopener noreferrer"&gt;LlamaIndex.TS&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.datastax.com/blog/using-genai-to-find-a-needle-with-haystack-and-astra-db?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Haystack&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://mastra.ai/examples/rag/insert-embedding-in-astra" rel="noopener noreferrer"&gt;Mastra&lt;/a&gt; (a newer framework, built by the team behind &lt;a href="https://www.gatsbyjs.com/" rel="noopener noreferrer"&gt;Gatsby&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And of course Astra DB is integrated into &lt;a href="https://www.langflow.org/" rel="noopener noreferrer"&gt;Langflow&lt;/a&gt;. Deeply integrated! Once you enter your application token into the Astra DB component, your databases will automatically load. Then once you select your database, you can pick the collection you need too.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvhydqtvi2dq3i1i38wb9.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvhydqtvi2dq3i1i38wb9.gif" alt="An animation showing how to use the Astra DB Langflow component. After you set an Application Token a dropdown option for Database appears, populated with your databases. Once you pick a database, a dropdown option for Collection appears allowing you to pick the collection to use." width="412" height="760"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can even create a new database from within Langflow. Oh, and Langflow supports using Astra Vectorize when ingesting or performing vector search too.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3u2p8mqguohxqve1i2jg.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3u2p8mqguohxqve1i2jg.gif" alt="An animation showing an Astra DB component in Langflow. When changing the collection to a Vectorize powered collection, the component updates to Astra Vectorize and disconnects the embedding model, which we then delete." width="600" height="713"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Langflow is a great visual way to build agents, and Astra DB makes it easy to build RAG or agentic RAG within Langflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Astra DB is ready to help you build transformative AI
&lt;/h2&gt;

&lt;p&gt;Whether you're looking to build with Langflow or any number of other frameworks, or try out alternative vector searches like graph RAG or ColBERT, Astra DB is there to help. And it will do it quickly, creating vectors for you via Vectorize and indexing them live so your data is always up to date.&lt;/p&gt;

&lt;p&gt;There are so many different applications you can build; check out examples like this &lt;a href="https://www.datastax.com/blog/building-resumai-langflow-astra-db-openai?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;AI resume assistant&lt;/a&gt;, &lt;a href="https://www.datastax.com/blog/rag-voice-agent-twilio-openai-astra-db-node-js?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;RAG-powered voice agent&lt;/a&gt;, or &lt;a href="https://www.datastax.com/blog/building-hum-to-search-music-recognition-app-vector-search?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;hum-to-search music recognition app&lt;/a&gt;, all powered by Astra DB.&lt;/p&gt;

&lt;p&gt;From chat bots to autonomous agents, Astra DB supports you in building the GenAI apps that are going to transform your business.&lt;/p&gt;

</description>
      <category>genai</category>
      <category>vectordatabase</category>
      <category>rag</category>
    </item>
    <item>
      <title>How to Stream Responses from the Langflow API in Node.js</title>
      <dc:creator>Phil Nash</dc:creator>
      <pubDate>Wed, 05 Mar 2025 21:34:53 +0000</pubDate>
      <link>https://forem.com/datastax/how-to-stream-responses-from-the-langflow-api-in-nodejs-41l5</link>
      <guid>https://forem.com/datastax/how-to-stream-responses-from-the-langflow-api-in-nodejs-41l5</guid>
      <description>&lt;p&gt;Building flows and AI agents in Langflow is one of the fastest ways to experiment with generative AI. Once you've built your flow, you’ll want to integrate it into your own application. Langflow exposes an API for this; &lt;a href="https://www.datastax.com/blog/use-langflow-api-in-node-js?utm_medium=byline&amp;amp;utm_campaign=how-to-stream-responses-from-langflow-api-in-node-js&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;we’ve written before about how to use it in Node.js&lt;/a&gt;. We've also seen that &lt;a href="https://www.datastax.com/blog/fetch-streams-api-for-faster-ux-generative-ai-apps?utm_medium=byline&amp;amp;utm_campaign=how-to-stream-responses-from-langflow-api-in-node-js&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;streaming GenAI outputs makes for a better user experience&lt;/a&gt;. So today, we're going to combine the two and show you how to stream results from your Langflow flows in Node.js.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvt09ihkjb1c9tg09gpkf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvt09ihkjb1c9tg09gpkf.png" alt="Example code for using the JavaScript Langflow client. It reads:  import { LangflowClient } from " width="800" height="482"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Using the Langflow client
&lt;/h2&gt;

&lt;p&gt;The easiest way to use the Langflow API is with the &lt;a href="https://www.npmjs.com/package/@datastax/langflow-client" rel="noopener noreferrer"&gt;@datastax/langflow-client npm module&lt;/a&gt;. You can get started with the client by installing the module with npm:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @datastax/langflow-client
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Langflow client can be used with both self-hosted and DataStax-hosted Langflow. You can see in-depth examples of &lt;a href="https://www.datastax.com/blog/use-langflow-api-in-node-js#initializing-for-datastax-hosted-langflow-2?utm_medium=byline&amp;amp;utm_campaign=how-to-stream-responses-from-langflow-api-in-node-js&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;how to set it up for either version of Langflow in this blog post&lt;/a&gt;. But the quick version is that for either type of Langflow, you start by importing the client:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;LangflowClient&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@datastax/langflow-client&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For self-hosted Langflow you need the URL where you’re hosting Langflow and, if you've set up user authorisation, an &lt;a href="https://docs.langflow.org/configuration-api-keys" rel="noopener noreferrer"&gt;API key&lt;/a&gt;. You then initialise the client with both:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;baseURL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http://localhost:7860&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;apiKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;YOUR_API_KEY&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;LangflowClient&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;baseURL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;apiKey&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For DataStax-hosted Langflow, &lt;a href="https://www.datastax.com/blog/use-langflow-api-in-node-js#initializing-for-datastax-hosted-langflow-2?utm_medium=byline&amp;amp;utm_campaign=how-to-stream-responses-from-langflow-api-in-node-js&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;you need your Langflow ID&lt;/a&gt; and to generate an API key. Then you create a client with the following code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;langflowId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;YOUR_LANGFLOW_ID&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;apiKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;YOUR_API_KEY&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;LangflowClient&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;langflowId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;apiKey&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Streaming with the Langflow client
&lt;/h2&gt;

&lt;p&gt;To stream through the API, you need a flow that’s set up for streaming responses. A streaming flow needs a model with streaming capabilities and the stream flag turned on, connected to a chat output. The basic prompting example, with streaming turned on, is a good example of this.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0qgitvo2l7u4vbotakif.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0qgitvo2l7u4vbotakif.png" alt="A screenshot of a Langflow flow with a chat input and prompt that are both connected to the OpenAI model. The OpenAI model component has the Stream setting enabled and it is connected to a Chat Output component." width="800" height="557"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you don't already have a flow, you can use the basic prompting flow as an example.&lt;/p&gt;

&lt;p&gt;Once you have your flow in place, open the API modal and get the flow ID.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fevtcg4n1p5ge30jdvbxl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fevtcg4n1p5ge30jdvbxl.png" alt="A screenshot of the Langflow API modal. It shows the API URL and points out that the flow ID can be found in the URL after api/v1/run." width="800" height="557"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With the flow ID and the Langflow client, you can create a flow object:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;flowId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;YOUR_FLOW_ID&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;flow&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;flow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flowId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To stream a response from the flow, you can use the &lt;code&gt;[stream function](https://www.npmjs.com/package/@datastax/langflow-client#streaming)&lt;/code&gt;. The response is a &lt;a href="https://developer.mozilla.org/en-US/docs/Web/API/ReadableStream#examples" rel="noopener noreferrer"&gt;&lt;code&gt;ReadableStream&lt;/code&gt;&lt;/a&gt; that you can iterate asynchronously over.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;flow&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Hello, how are you?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;await &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There are three types of event that the stream emits; this is what each of them means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;add_message&lt;/code&gt;: a message has been added to the chat. It can refer to a human input message or a response from an AI.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;token&lt;/code&gt;: a token has been emitted as part of a message being generated by the model.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;end&lt;/code&gt;: all tokens have been returned; this message will also contain the same full response that you get from a non-streaming request&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to log out just the text from a flow response you can do the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;flow&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Hello, how are you?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;await &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;token&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8sltv8sbs7dknlt3s2z5.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8sltv8sbs7dknlt3s2z5.gif" alt="An animation of a terminal program running the code that logs each chunk. It logs its response one word at a time, but quickly." width="558" height="404"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The stream function takes all the same arguments as the &lt;a href="https://www.npmjs.com/package/@datastax/langflow-client#calling-a-flow" rel="noopener noreferrer"&gt;run function&lt;/a&gt;, so you can provide tweaks for your components, too.&lt;/p&gt;

&lt;h2&gt;
  
  
  Integrating with Express
&lt;/h2&gt;

&lt;p&gt;If you want to make an API request from an &lt;a href="https://expressjs.com/" rel="noopener noreferrer"&gt;Express server&lt;/a&gt; and then stream it to your own front-end, you can do the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;/stream&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;_req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text/plain&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Transfer-Encoding&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;chunked&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;flow&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Hello, how are you?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;await &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;token&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;end&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We explored &lt;a href="https://www.datastax.com/blog/fetch-streams-api-for-faster-ux-generative-ai-apps" rel="noopener noreferrer"&gt;how you can handle a stream on the front-end in this blog post&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stream your flows
&lt;/h2&gt;

&lt;p&gt;Langflow enables you to rapidly build, experiment with, and deploy GenAI applications and with the &lt;a href="https://www.npmjs.com/package/@datastax/langflow-client" rel="noopener noreferrer"&gt;JavaScript Langflow client&lt;/a&gt; you can easily stream those responses in your JavaScript applications.&lt;/p&gt;

&lt;p&gt;Please do try out the Langflow client; if you have any issues, please raise them on &lt;a href="https://github.com/datastax/langflow-client-ts" rel="noopener noreferrer"&gt;the GitHub repo&lt;/a&gt;. If you're looking for more inspiration for building AI agents with Langflow, check out these posts that cover how to &lt;a href="https://www.datastax.com/blog/build-simple-ai-agent-with-langflow-composio" rel="noopener noreferrer"&gt;build an agent that can manage your calendar with Langflow and Composio&lt;/a&gt; or see how you can &lt;a href="https://www.datastax.com/blog/local-ai-using-ollama-with-agents" rel="noopener noreferrer"&gt;build local agents with Langflow and Ollama&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>node</category>
      <category>ai</category>
      <category>langflow</category>
      <category>genai</category>
    </item>
    <item>
      <title>Build a RAG-Powered Voice Agent with Twilio Voice, OpenAI, Astra DB, and Node.js</title>
      <dc:creator>Phil Nash</dc:creator>
      <pubDate>Wed, 19 Feb 2025 23:08:48 +0000</pubDate>
      <link>https://forem.com/datastax/build-a-rag-powered-voice-agent-with-twilio-voice-openai-astra-db-and-nodejs-266n</link>
      <guid>https://forem.com/datastax/build-a-rag-powered-voice-agent-with-twilio-voice-openai-astra-db-and-nodejs-266n</guid>
      <description>&lt;p&gt;With the &lt;a href="https://platform.openai.com/docs/guides/realtime" rel="noopener noreferrer"&gt;OpenAI Realtime API&lt;/a&gt;, you can build speech-to-speech applications that let you interact directly with a generative AI model by speaking with it. Talking directly to a model feels really natural, and the Realtime API makes it possible to build experiences like this into your own applications and businesses.&lt;/p&gt;

&lt;p&gt;One example of this was built by Twilio: it enables you to  &lt;a href="https://www.twilio.com/en-us/blog/voice-ai-assistant-openai-realtime-api-node" rel="noopener noreferrer"&gt;connect a phone call to GPT-4o with Node.js&lt;/a&gt; (or, if you prefer, &lt;a href="https://www.twilio.com/en-us/blog/voice-ai-assistant-openai-realtime-api-python" rel="noopener noreferrer"&gt;Python&lt;/a&gt;). The example is great, but it only shows connecting to a plain GPT-4o with a system prompt that encourages owl facts and jokes. Much as I like owl facts, I wanted to see what else we could achieve with a voice agent like this.&lt;/p&gt;

&lt;p&gt;In this post, we'll show you how to extend the original assistant into an agent that can choose to use tools to augment its response. We'll give it additional, up-to-date knowledge via &lt;a href="https://www.datastax.com/guides/what-is-retrieval-augmented-generation?utm_medium=byline&amp;amp;utm_source=devto&amp;amp;utm_campaign=voice-agent" rel="noopener noreferrer"&gt;retrieval-augmented generation (RAG)&lt;/a&gt; using Astra DB.&lt;/p&gt;

&lt;p&gt;Want to try it out before we dive into the details? Call (855) 687-9438 (that's 855-6-TSWIFT) and have a chat!&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;First, you’ll need to set up the application from the &lt;a href="https://www.twilio.com/en-us/blog/voice-ai-assistant-openai-realtime-api-node" rel="noopener noreferrer"&gt;Twilio blog post&lt;/a&gt;, so you'll need a Twilio account and an OpenAI API key. Make sure you can make a call and chat with the bot successfully.&lt;/p&gt;

&lt;p&gt;You will also need a &lt;a href="https://astra.datastax.com/signup?utm_medium=byline&amp;amp;utm_source=devto&amp;amp;utm_campaign=voice-agent" rel="noopener noreferrer"&gt;free DataStax account&lt;/a&gt; so you can set up RAG with Astra DB.&lt;/p&gt;

&lt;h2&gt;
  
  
  What we’re going to build
&lt;/h2&gt;

&lt;p&gt;We already have a voice-capable bot that you can speak to over the phone. We're going to gather some up-to-date data and store it in Astra DB to help the bot answer questions.&lt;/p&gt;

&lt;p&gt;The OpenAI Realtime API enables you to define tools that the model can use to execute functions and extend its capabilities. We’ll give the model a tool that enables it to search the database for additional information (this is an example of &lt;a href="https://www.youtube.com/watch?v=MYPDsV_825U" rel="noopener noreferrer"&gt;agentic RAG&lt;/a&gt;).&lt;/p&gt;

&lt;h2&gt;
  
  
  Ingesting data
&lt;/h2&gt;

&lt;p&gt;To test out this agent, we're going to write a quick script to load and parse a web page, turn the content into chunks, turn those chunks into vector embeddings, and store them in Astra DB.&lt;/p&gt;

&lt;h3&gt;
  
  
  Create your database
&lt;/h3&gt;

&lt;p&gt;To kick this process off, you'll need to create a database. &lt;a href="https://astra.datastax.com?utm_medium=byline&amp;amp;utm_source=devto&amp;amp;utm_campaign=voice-agent" rel="noopener noreferrer"&gt;Log into your DataStax account&lt;/a&gt; and, on the Astra DB dashboard, click &lt;em&gt;Create a Database&lt;/em&gt;. Choose a &lt;em&gt;Serverless (Vector)&lt;/em&gt; database, give it a name, and pick a provider and region. That will take a couple of minutes to provision. While it's doing that, have a think about some good web pages you might want to ingest into this database.&lt;/p&gt;

&lt;p&gt;Once the database is ready, click on the &lt;em&gt;Data Explorer&lt;/em&gt; tab and then the &lt;em&gt;Create Collection +&lt;/em&gt; button. Give your collection a name, ensure it is a vector-enabled collection and choose NVIDIA as the embedding generation method. This will &lt;a href="https://docs.datastax.com/en/astra-db-serverless/databases/embedding-generation.html?utm_medium=byline&amp;amp;utm_source=devto&amp;amp;utm_campaign=voice-agent" rel="noopener noreferrer"&gt;automatically generate vector embeddings for the content we insert into the collection&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Connect to the database
&lt;/h3&gt;

&lt;p&gt;Open the &lt;a href="https://github.com/twilio-samples/speech-assistant-openai-realtime-api-node" rel="noopener noreferrer"&gt;application code&lt;/a&gt; in your favourite text editor. To get the application running, you’ll have created a .env file and populated it with your OpenAI API key (and if you didn't do that yet, now is definitely the time). Open that &lt;em&gt;.env&lt;/em&gt; file and add some more environment variables.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;
&lt;span class="nv"&gt;ASTRA_DB_API_ENDPOINT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;
&lt;span class="nv"&gt;ASTRA_DB_COLLECTION_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Fill in the variables with the information from your database. You can find the API endpoint and generate an application token from the database overview in the Astra DB dashboard. Enter the name of the collection you just created, too.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmhwc58vn9zlqz8r250tw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmhwc58vn9zlqz8r250tw.png" alt="A screenshot of the database details in the DataStax dashboard. On the right of the page you can see your API endpoint and generate an application token. The collection can be seen in the Data Explorer tab." width="800" height="639"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now we can connect to the database in the application. Install the Astra DB client from npm.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @datastax/astra-db-ts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create a new file in the application called db.js. Open the file and enter the following code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;DataAPIClient&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@datastax/astra-db-ts&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;dotenv&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;dotenv&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nx"&gt;dotenv&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;config&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;ASTRA_DB_API_ENDPOINT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;ASTRA_DB_COLLECTION_NAME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;DataAPIClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;db&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ASTRA_DB_API_ENDPOINT&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ASTRA_DB_COLLECTION_NAME&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This code loads the client from the Astra DB module and the variables in the &lt;em&gt;.env&lt;/em&gt; file into the environment. It then uses those environment variables as credentials to connect to the collection, and exports the collection object to be used elsewhere in the application.&lt;/p&gt;

&lt;h3&gt;
  
  
  Get some data
&lt;/h3&gt;

&lt;p&gt;Now let's create a script that loads and parses a web page, then splits it into chunks and stores it in Astra DB. This script is going to combine some of the techniques in blog posts about &lt;a href="https://www.datastax.com/blog/html-content-retrieval-augmented-generation-readability-js?utm_medium=byline&amp;amp;utm_source=devto&amp;amp;utm_campaign=voice-agent" rel="noopener noreferrer"&gt;scraping web pages&lt;/a&gt;, &lt;a href="https://www.datastax.com/blog/how-to-chunk-text-in-javascript-for-rag-applications?utm_medium=byline&amp;amp;utm_source=devto&amp;amp;utm_campaign=voice-agent" rel="noopener noreferrer"&gt;chunking text&lt;/a&gt;, and &lt;a href="https://www.datastax.com/blog/how-to-create-vector-embeddings-in-node-js?utm_medium=byline&amp;amp;utm_source=devto&amp;amp;utm_campaign=voice-agent" rel="noopener noreferrer"&gt;creating vector embeddings&lt;/a&gt;. To read more in depth about those, check out those posts.&lt;/p&gt;

&lt;p&gt;Install the dependencies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @langchain/textsplitters @mozilla/readability jsdom
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create a file called &lt;em&gt;ingest.js&lt;/em&gt; and copy the following code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;RecursiveCharacterTextSplitter&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@langchain/textsplitters&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Readability&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@mozilla/readability&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;JSDOM&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;jsdom&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;collection&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;./db.js&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;parseArgs&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;node:util&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;values&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;parseArgs&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;options&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;short&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;u&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;values&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;html&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;JSDOM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Readability&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;article&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;splitter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;RecursiveCharacterTextSplitter&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;chunkSize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;chunkOverlap&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;splitter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;splitText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;article&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;textContent&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;$vectorize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}));&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insertMany&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This script:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;uses the Node.js &lt;a href="https://nodejs.org/api/util.html#utilparseargsconfig" rel="noopener noreferrer"&gt;argument parser&lt;/a&gt; to get a URL from the command line arguments&lt;/li&gt;
&lt;li&gt;loads the web page at that URL&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.datastax.com/blog/html-content-retrieval-augmented-generation-readability-js?utm_medium=byline&amp;amp;utm_source=devto&amp;amp;utm_campaign=voice-agent" rel="noopener noreferrer"&gt;parses the content from the page using Readability.js and JSDOM&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.datastax.com/blog/how-to-chunk-text-in-javascript-for-rag-applications?utm_medium=byline&amp;amp;utm_source=devto&amp;amp;utm_campaign=voice-agent" rel="noopener noreferrer"&gt;splits the text into 500 character chunks with 100 character overlap&lt;/a&gt; using the &lt;code&gt;RecursiveCharacterTextSplitter&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;turns the chunks into objects where the chunk of text becomes the &lt;code&gt;$vectorize&lt;/code&gt; property&lt;/li&gt;
&lt;li&gt;inserts all the documents into the collection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://www.datastax.com/blog/simplifying-vector-embedding-generation-with-astra-vectorize?utm_medium=byline&amp;amp;utm_source=devto&amp;amp;utm_campaign=voice-agent" rel="noopener noreferrer"&gt;Using the &lt;code&gt;$vectorize&lt;/code&gt; property tells Astra DB to automatically create vector embeddings&lt;/a&gt; for this content.&lt;/p&gt;

&lt;p&gt;We can now run this file from the command line. For example, here's how to ingest the Wikipedia page on Taylor Swift:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;node ingest.js &lt;span class="nt"&gt;--url&lt;/span&gt; https://en.wikipedia.org/wiki/Taylor_Swift
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once this command has been run, check the collection in the DataStax dashboard to see the contents and the vectors.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy54sly6h2c8zijyiejm3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy54sly6h2c8zijyiejm3.png" alt="Viewing the Data Explorer in Astra DB. There should be a table of the content that you have ingested, in the $vectorize column you will find the text data and in the $vector column, the vector data that the database created for you." width="800" height="639"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Build the voice agent
&lt;/h2&gt;

&lt;p&gt;To turn our existing voice assistant into an agent that can choose to search the database for more information, we need to provide it with a tool, or function, that it can choose to use.&lt;/p&gt;

&lt;p&gt;Create a new file called tools.js and open it in your editor. Start by importing &lt;code&gt;collection&lt;/code&gt; from &lt;em&gt;db.js&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;collection&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;./db.js&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next we need to create the function that the agent can use to search the database.&lt;/p&gt;

&lt;p&gt;When the OpenAI agent provides parameters to call a function with, it does so as an object. So the function should receive an object, from which we can destruct to extract the query. We'll then use the query to perform a vector search against our collection.&lt;/p&gt;

&lt;p&gt;We can use Astra DB Vectorize to automatically create a vector embedding of the query. We'll also limit the results to the top 10 and ensure we return the text from the chunks by selecting &lt;code&gt;$vectorize&lt;/code&gt; in the &lt;a href="https://docs.datastax.com/en/astra-db-serverless/api-reference/documents.html?utm_medium=byline&amp;amp;utm_source=devto&amp;amp;utm_campaign=voice-agent#projection-operations" rel="noopener noreferrer"&gt;projection&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Calling &lt;code&gt;find&lt;/code&gt; on the collection with these arguments will return a cursor, which we can turn into an array by calling &lt;code&gt;toArray&lt;/code&gt;. We then iterate over the array of documents, extracting just the text and then joining the resulting array with a newline to create a single string result that can be provided as context to the agent.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;taylorSwiftFacts&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
&lt;span class="p"&gt;{},&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$vectorize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;projection&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$vectorize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toArray&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;$vectorize&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s2"&gt;n&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I've called the function &lt;code&gt;taylorSwiftFacts&lt;/code&gt; because that's what I loaded with my ingestion script; feel free to use a different name.&lt;/p&gt;

&lt;p&gt;This is our first tool; we can write more, but for now we can just export this as an object of tools.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;TOOLS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;taylorSwiftFacts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To help the model choose when to use this tool, it needs a description of what it can do and the arguments it expects. For each tool you provide a type, name, description, and the parameters.&lt;/p&gt;

&lt;p&gt;For our function the type will be "function" and the name is &lt;code&gt;taylorSwiftFacts&lt;/code&gt;. The description will tell the agent that we have up-to-date information about Taylor Swift that it can search for. The parameters are a JSON schema description of the arguments your function expects, this tool is relatively simple as it only requires one parameter called query, which is a string. The full description looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;DESCRIPTIONS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;function&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;taylorSwiftFacts&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Search for up to date information about Taylor Swift from her wikipedia page&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;object&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;The search query&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Our tool definition is complete for now, so let's add them to our agent.&lt;/p&gt;

&lt;h3&gt;
  
  
  Handling function calls in a voice agent
&lt;/h3&gt;

&lt;p&gt;We've been building supporting functions around the existing application so far, but to connect our tool to the agent we need to dig into the main body of code. Open &lt;em&gt;index.js&lt;/em&gt; in our editor and start by importing the tool we just defined:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;Fastify&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;fastify&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;WebSocket&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ws&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;dotenv&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;dotenv&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;fastifyFormBody&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@fastify/formbody&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;fastifyWs&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@fastify/websocket&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;DESCRIPTIONS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;TOOLS&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;./tools.js&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We need to update the system prompt to more accurately describe what the agent is capable of with the tool available to it. Since we ingested the wikipedia page for Taylor Swift earlier, we can update it to behave like a Taylor Swift superfan.&lt;/p&gt;

&lt;p&gt;Find the &lt;code&gt;SYSTEM_MESSAGE&lt;/code&gt; constant and update with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;SYSTEM_MESSAGE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;You are a helpful and bubbly AI assistant who loves Taylor Swift. You can use your knowledge about Taylor Swift to answer questions, but if you don't know the answer, you can search for relevant facts with your available tools.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next we need to provide the tool we have built to the agent. Find the &lt;code&gt;initializeSession&lt;/code&gt; function, it defines a &lt;code&gt;sessionUpdate&lt;/code&gt; object that includes all the details to initialize the agent. Add a tools property to the session object using the &lt;code&gt;DESCRIPTIONS&lt;/code&gt; object we imported earlier:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;          &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;sessionUpdate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;session.update&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="na"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="na"&gt;turn_detection&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;server_vad&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
                    &lt;span class="na"&gt;input_audio_format&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;g711_ulaw&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="na"&gt;output_audio_format&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;g711_ulaw&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="na"&gt;voice&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;VOICE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="na"&gt;instructions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;SYSTEM_MESSAGE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="na"&gt;modalities&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;audio&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                    &lt;span class="na"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;DESCRIPTIONS&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can also provide tools on a request-by-request basis, but this agent will benefit from access to this tool in all its interactions.&lt;/p&gt;

&lt;p&gt;Finally we need to handle the event when the model requests to use a tool. Find the event handler for when the connection to OpenAI receives a message, it looks like: &lt;code&gt;openAiWs.on('message', … )&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Change the event handler to an &lt;code&gt;async&lt;/code&gt; function:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;openAiWs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;on&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;message&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When the Realtime API wants to use a tool, it sends an event with the type "response.done." Within the event object there are outputs, and if one of the outputs has a type of "function_call" we know the model wants to use one of its tools.&lt;/p&gt;

&lt;p&gt;The output provides the name of the function it wants to call and the arguments. We can look up the tool in our object of &lt;code&gt;TOOLS&lt;/code&gt; that we imported, then call it with the arguments.&lt;/p&gt;

&lt;p&gt;When we have the result of the function call we pass it back to the model so that it can choose what to do next. We do so by creating a new message with the type "conversation.item.create" and within that message we include an item with the type "function_call_output", the output of the function call, and the ID that the original event had, so that the model can tie the response to the original query.&lt;/p&gt;

&lt;p&gt;We send this to the model as well as another message with the type "response.create" which requests the model use this new information to return a new response.&lt;/p&gt;

&lt;p&gt;Overall, this enables the model to request to use the database search function we defined and provide the arguments it wants to call the function with. We are then responsible for calling the function and returning the results to the model. The whole code looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;      &lt;span class="nx"&gt;openAiWs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;on&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;message&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

            &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;LOG_EVENT_TYPES&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;type&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
              &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Received event: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;type&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;

            &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;type&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;response.done&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
              &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;outputs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;output&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
              &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;functionCall&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;outputs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;output&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;output&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;type&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;function_call&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
              &lt;span class="p"&gt;);&lt;/span&gt;
              &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;functionCall&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;TOOLS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;functionCall&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;TOOLS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;functionCall&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;
                  &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;functionCall&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arguments&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;conversationItemCreate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;conversation.item.create&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                  &lt;span class="na"&gt;item&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;function_call_output&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="na"&gt;call_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;functionCall&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;call_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="na"&gt;output&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                  &lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="p"&gt;};&lt;/span&gt;
                &lt;span class="nx"&gt;openAiWs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;conversationItemCreate&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
                &lt;span class="nx"&gt;openAiWs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;response.create&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}));&lt;/span&gt;
              &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;

            &lt;span class="c1"&gt;// other event handlers&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Start the application and make sure it is connected to your Twilio number as described in the &lt;a href="https://www.twilio.com/en-us/blog/voice-ai-assistant-openai-realtime-api-node" rel="noopener noreferrer"&gt;Twilio blog post&lt;/a&gt;. Now we can call and chat all things Taylor Swift.&lt;/p&gt;

&lt;p&gt;If you want to try this out with my assistant, you can give it a call on (855) 687-9438.&lt;/p&gt;

&lt;p&gt;This is now a new way to connect with the &lt;a href="https://www.datastax.com/blog/using-astradb-vector-to-build-taylor-swift-chatbot?utm_medium=byline&amp;amp;utm_source=devto&amp;amp;utm_campaign=voice-agent" rel="noopener noreferrer"&gt;Taylor Swift bot we built&lt;/a&gt; a while back. So now you can &lt;a href="https://www.tswift.ai/" rel="noopener noreferrer"&gt;chat with SwiftieGPT online&lt;/a&gt; or on the phone.&lt;/p&gt;

&lt;h2&gt;
  
  
  Give your voice assistants some agency
&lt;/h2&gt;

&lt;p&gt;Real-time voice agents are very cool, but they have all the same drawbacks as a plain LLM. In this post we added agentic RAG capabilities to our voice agent and it was able to use up-to-date knowledge to answer our questions about Taylor Swift.&lt;/p&gt;

&lt;p&gt;When you provide a voice agent with tools, like context from a vector database, the results are very impressive. The combination of Twilio, OpenAI, and Astra DB creates a very powerful agent.&lt;/p&gt;

&lt;p&gt;You can find the code to this in my &lt;a href="https://github.com/philnash/speech-assistant-openai-realtime-api-node" rel="noopener noreferrer"&gt;fork of the Twilio project&lt;/a&gt;. You don't have to stop here though; you can define and add further tools to the agent. Make sure you check out &lt;a href="https://platform.openai.com/docs/guides/function-calling#best-practices-for-defining-functions" rel="noopener noreferrer"&gt;OpenAI's best practices for defining functions for your models&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you're interested in building other agents, check out &lt;a href="https://www.datastax.com/blog/build-simple-ai-agent-with-langflow-composio?utm_medium=byline&amp;amp;utm_source=devto&amp;amp;utm_campaign=voice-agent" rel="noopener noreferrer"&gt;how to work with Langflow and Composio&lt;/a&gt; or the &lt;a href="https://www.youtube.com/watch?v=mn1ZnlqnQlg" rel="noopener noreferrer"&gt;workshop and videos from the recent Hacking Agents event&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Are you excited about voice agents or agentic RAG? Come chat about it and what you're building in the &lt;a href="https://discord.gg/datastax" rel="noopener noreferrer"&gt;DataStax Devs Discord&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Want to roll up your sleeves and build with OpenAI, Twilio, Cloudflare, Unstructured, and DataStax? &lt;a href="https://lu.ma/hacking-agents-hackathon" rel="noopener noreferrer"&gt;Join us on Feb. 28 in San Francisco for the Hacking Agents Hackathon&lt;/a&gt;, an epic 24-hour hackathon where we'll be diving into what developers can build with the latest and greatest in AI tooling.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>node</category>
      <category>twilio</category>
      <category>openai</category>
      <category>ai</category>
    </item>
    <item>
      <title>How to Use the Langflow API in Node.js</title>
      <dc:creator>Phil Nash</dc:creator>
      <pubDate>Tue, 28 Jan 2025 17:27:05 +0000</pubDate>
      <link>https://forem.com/datastax/how-to-use-the-langflow-api-in-nodejs-2a12</link>
      <guid>https://forem.com/datastax/how-to-use-the-langflow-api-in-nodejs-2a12</guid>
      <description>&lt;p&gt;&lt;a href="https://www.langflow.org/" rel="noopener noreferrer"&gt;Langflow&lt;/a&gt; is a fantastic low-code tool for &lt;a href="https://www.youtube.com/watch?v=LuJ_FM1l1OA" rel="noopener noreferrer"&gt;building generative AI flows and agents&lt;/a&gt;. Once you've built your flow, it’s time to integrate it into your own application using the &lt;a href="https://docs.langflow.org/concepts-api" rel="noopener noreferrer"&gt;Langflow API&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In Node.js applications, you can construct and make calls directly to the API with &lt;a href="https://nodejs.org/api/globals.html#fetch" rel="noopener noreferrer"&gt;fetch&lt;/a&gt;, the &lt;a href="https://nodejs.org/api/http.html" rel="noopener noreferrer"&gt;http module&lt;/a&gt;, or using your favourite HTTP client like &lt;a href="https://www.npmjs.com/package/axios" rel="noopener noreferrer"&gt;axios&lt;/a&gt; or &lt;a href="https://www.npmjs.com/package/got" rel="noopener noreferrer"&gt;got&lt;/a&gt;. To make it easier, you can now use this &lt;a href="https://www.npmjs.com/package/@datastax/langflow-client" rel="noopener noreferrer"&gt;JavaScript Langflow client&lt;/a&gt;. Let's take a look at how it works.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy6fimmktd8fpk20q64p8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy6fimmktd8fpk20q64p8.png" alt="Example code for using the JavaScript Langflow client. It reads:  import { LangflowClient } from '@datastax/langflow-client';  const client = new LangflowClient({ langflowId, apiKey }); const flow = client.flow(flowId); const response = await flow.run(" width="800" height="439"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What you'll need
&lt;/h2&gt;

&lt;p&gt;You can use the JavaScript Langflow client with either the &lt;a href="https://www.langflow.org/" rel="noopener noreferrer"&gt;open-source, self-hosted version of Langflow&lt;/a&gt; or the &lt;a href="https://www.datastax.com/products/langflow?utm_medium=byline&amp;amp;utm_campaign=use-langflow-api-in-node-js&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;DataStax cloud-hosted version of Langflow&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Note: this Langflow client is for using on the server. The Langflow API uses API keys, which should not be exposed, so it isn’t suitable for using directly from the front-end.&lt;/p&gt;

&lt;p&gt;To test the client out, you’ll either need to &lt;a href="https://github.com/langflow-ai/langflow?tab=readme-ov-file#-quickstart" rel="noopener noreferrer"&gt;host your own version of Langflow&lt;/a&gt; or &lt;a href="https://astra.datastax.com/signup?type=langflow&amp;amp;utm_medium=byline&amp;amp;utm_campaign=use-langflow-api-in-node-js&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;sign up for a free DataStax account and use the cloud-hosted version&lt;/a&gt;. Once you are set up with Langflow, make sure you have a flow to test the API out with. The &lt;a href="https://docs.langflow.org/starter-projects-basic-prompting" rel="noopener noreferrer"&gt;basic prompting flow template&lt;/a&gt; is a good start, or, if you're looking for something with a bit more agency, check out the &lt;a href="https://docs.langflow.org/starter-projects-simple-agent" rel="noopener noreferrer"&gt;simple agent template&lt;/a&gt;. You'll need an &lt;a href="https://platform.openai.com/api-keys" rel="noopener noreferrer"&gt;OpenAI API key&lt;/a&gt; to run these flows, or you can change out the model provider if you want to. Make sure the flow works with a test in the playground; once it’s responding, you’re ready to make calls to the API.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting started with the JavaScript Langflow client
&lt;/h2&gt;

&lt;p&gt;To demonstrate how to use the Langflow client, let's start a small TypeScript application. Create a new directory, change into it, and initialize a new Node.js project:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir &lt;/span&gt;using-langflow-client
&lt;span class="nb"&gt;cd &lt;/span&gt;using-langflow-client
npm init &lt;span class="nt"&gt;--yes&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Install the client using your favourite package manager:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @datastax/langflow-client
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Install some other tools that will help us write and run the application:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install &lt;/span&gt;tsx @types/node &lt;span class="nt"&gt;--save-dev&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create a new file called &lt;em&gt;index.ts&lt;/em&gt; and open it in your editor of choice. Start by importing the client.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;LangflowClient&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@datastax/langflow-client&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now you can initialize a client to use with the Langflow API. How you do this depends on whether you’re using DataStax-hosted Langflow or self-hosted Langflow.&lt;/p&gt;

&lt;h3&gt;
  
  
  Initializing for DataStax-hosted Langflow
&lt;/h3&gt;

&lt;p&gt;When you use DataStax-hosted Langflow, you’ll need your Langflow ID and an API key. You can get both from the API modal that you can access from the Langflow canvas.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fje53twtd8upwj3n9mygf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fje53twtd8upwj3n9mygf.png" alt="The API modal is accessed by clicking the API button located in the top right of the canvas." width="800" height="541"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The Langflow ID is in the API URL and you can generate an API key, too.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foyaqmlyd8dlcd0xmp8vt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foyaqmlyd8dlcd0xmp8vt.png" alt='In the API modal there is a button that says "Generate Token" which you can use to generate an API token. You can also find your Langflow ID in the API URL between /lf/ and /api/.' width="800" height="541"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can then create a client with the following code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;langflowId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;YOUR_LANGFLOW_ID&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;apiKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;YOUR_API_KEY&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;LangflowClient&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;langflowId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;apiKey&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Initializing for self-hosted Langflow
&lt;/h3&gt;

&lt;p&gt;If you’re self-hosting Langflow, or just running it locally, you’ll need the URL from which you access Langflow.&lt;/p&gt;

&lt;p&gt;If you have set up authentication for your instance of Langflow, you’ll need to &lt;a href="https://docs.langflow.org/configuration-api-keys" rel="noopener noreferrer"&gt;create an API key for your user&lt;/a&gt;. If you haven't yet set up authentication for your instance of Langflow, you can omit the API key.&lt;/p&gt;

&lt;p&gt;You can then initialize the client like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;baseURL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http://localhost:7860&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;apiKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;YOUR_API_KEY&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;LangflowClient&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;baseURL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;apiKey&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Running a flow
&lt;/h3&gt;

&lt;p&gt;No matter which way you initialized your client, you can now use it to run your flows. To do so, you’ll need the flow ID, which can be found in the API modal in the flow canvas.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fres627jtmsp83ett7x0y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fres627jtmsp83ett7x0y.png" alt="Your flow ID can be found in the API modal in the API URL after /run/." width="800" height="541"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can get a reference to a flow by calling on the client like so:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;flowId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;YOUR_FLOW_ID&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;flow&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;flow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flowId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can run the flow by calling &lt;code&gt;run&lt;/code&gt; and passing it the input to your flow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;flow&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Hello, how are you?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;outputs&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you run the application now, your flow will run and output your results.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx tsx ./index.ts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0r7191uzcm81imblp7mk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0r7191uzcm81imblp7mk.png" alt="A screenshot of the example code in VS Code, with the integrated terminal open at the bottom of the screen having run the example. The response says, " start="" width="800" height="458"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Flow responses
&lt;/h3&gt;

&lt;p&gt;Flows return a lot of data: everything you could want to know about how the flow ran. The most important part of the response is the output from the flow; the Langflow client tries to make this easy.&lt;/p&gt;

&lt;p&gt;You can take the flow response from above and instead of logging the entire set of response outputs, you can call:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;flow&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Hello, how are you?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chatOutputText&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The client will return the text from the first chat output component in the response.&lt;/p&gt;

&lt;p&gt;If you need the session ID, or more detail from any of the outputs, you can access the full response from the FlowResponse object:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;flow&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Hello, how are you?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;outputs&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Options for running a flow
&lt;/h3&gt;

&lt;p&gt;Using &lt;code&gt;flow.run(input)&lt;/code&gt; will run your flow with several defaults. The input and output types will be set to chat and it’ll use the &lt;a href="https://docs.langflow.org/concepts-playground#use-custom-session-ids-for-multiple-user-interactions" rel="noopener noreferrer"&gt;default session&lt;/a&gt;. If your flow requires different settings, you can update the parameters. For example, if you want to set the input and output types to text and pass a session ID, you can do the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;InputTypes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;OutputTypes&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt;
&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@datastax/langflow-client/consts&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// set up flow as above&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;flow&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Hello, how are you?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;input_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;InputTypes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;output_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;OutputTypes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;USER_SESSION_ID&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Tweaks
&lt;/h3&gt;

&lt;p&gt;Langflow is flexible enough to enable you to change the settings for any of the components in a flow. For example, you might have set up the flow to use the OpenAI model component using the gpt-4o-mini model, but you want to test the flow with gpt-4o. Instead of updating the flow itself, you can send a tweak by providing the ID of the component and the parameters you want to override.&lt;/p&gt;

&lt;p&gt;The JavaScript Langflow client supports tweaks in a couple of ways.&lt;/p&gt;

&lt;p&gt;You can add a tweak to a flow object, like so:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;flow&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;flow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flowId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;tweakedFlow&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;flow&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tweak&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;OpenAIModel-KqkTB&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;gpt-4o&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates a new flow object, so if you call &lt;code&gt;run&lt;/code&gt; on the original &lt;code&gt;flow&lt;/code&gt; object it will use the original model and if you call run on the tweakedFlow object it will use gpt-4o.&lt;/p&gt;

&lt;p&gt;You can also provide your tweaks as an object when you run the flow.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;tweaks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;OpenAIModel-KqkTB&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;model_name&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;gpt-4o&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}};&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;flow&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Hello, how are you?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;tweaks&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Let's make this better together
&lt;/h2&gt;

&lt;p&gt;This is the first release of this Langflow client and we want it to be the easiest way for you to use Langflow in your JavaScript server-side applications. The code is open-source and &lt;a href="https://github.com/datastax/langflow-client-ts/" rel="noopener noreferrer"&gt;available on GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you have feedback, suggestions, or you want to contribute, please do so over on GitHub. And if you like the library, please leave a star on the GitHub repo.&lt;/p&gt;

</description>
      <category>node</category>
      <category>langflow</category>
      <category>genai</category>
    </item>
    <item>
      <title>Clean up HTML Content for Retrieval-Augmented Generation with Readability.js</title>
      <dc:creator>Phil Nash</dc:creator>
      <pubDate>Tue, 21 Jan 2025 18:22:44 +0000</pubDate>
      <link>https://forem.com/datastax/clean-up-html-content-for-retrieval-augmented-generation-with-readabilityjs-5gmg</link>
      <guid>https://forem.com/datastax/clean-up-html-content-for-retrieval-augmented-generation-with-readabilityjs-5gmg</guid>
      <description>&lt;p&gt;Scraping web pages is one way to fetch content for your &lt;a href="https://www.ibm.com/think/topics/retrieval-augmented-generation" rel="noopener noreferrer"&gt;retrieval-augmented generation (RAG)&lt;/a&gt; application. But parsing the content from a web page can be a pain.&lt;/p&gt;

&lt;p&gt;Mozilla's open-source library &lt;a href="https://github.com/mozilla/readability" rel="noopener noreferrer"&gt;Readability.js&lt;/a&gt; is a useful tool for extracting just the important parts of a web page. Let's look at how to use it as part of a data ingestion pipeline for a RAG application.&lt;/p&gt;

&lt;h2&gt;
  
  
  Retrieving unstructured data from a web page
&lt;/h2&gt;

&lt;p&gt;Web pages are a source of unstructured data that we can use in RAG-based apps. But web pages are often full of content that is irrelevant; things like headers, sidebars, and footers. They contain useful context for someone browsing the site, but detract from the main subject of a page.&lt;/p&gt;

&lt;p&gt;To get the best data for RAG, we need to remove irrelevant content. When you’re working within one site, you can use tools like &lt;a href="https://cheerio.js.org/" rel="noopener noreferrer"&gt;cheerio&lt;/a&gt; to parse the HTML yourself based on your knowledge of the site's structure. But if you're scraping pages across different layouts and designs, you need a good way to return just the relevant content and avoid the rest.&lt;/p&gt;

&lt;h2&gt;
  
  
  Repurposing reader view
&lt;/h2&gt;

&lt;p&gt;Most web browsers come with a reader view that strips out everything but the article title and content. Here is the difference between the browser and reader mode when applied to a blog post on my persona site.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx3eae9ravn78va1sm5xt.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx3eae9ravn78va1sm5xt.gif" alt="An animation that shows browsing a blog post on my personal site You can see the content, but also the header navigation and the sidebar, which are irrelevant for RAG. Clicking on reader mode strips away the site design and all the extra content leaving just the article." width="756" height="490"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Mozilla makes the underlying library for Firefox's reader mode available as a standalone open-source module: &lt;a href="https://github.com/mozilla/readability" rel="noopener noreferrer"&gt;Readability.js&lt;/a&gt;. So we can use Readability.js in a data pipeline to strip irrelevant content and return high quality results from scraping a web page.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to scrape data with Node.js and Readability.js
&lt;/h2&gt;

&lt;p&gt;Let's take a look at an example of scraping the article content from my previous blog post on &lt;a href="https://philna.sh/blog/2024/09/25/how-to-create-vector-embeddings-in-node-js/" rel="noopener noreferrer"&gt;creating vector embeddings in Node.js&lt;/a&gt;. Here's some JavaScript you can use to retrieve the HTML for the page:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;html&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="nx"&gt;https&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="c1"&gt;//philna.sh/blog/2024/09/25/how-to-create-vector-embeddings-in-node-js/""&lt;/span&gt;
&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;html&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This includes all the HTML tags as well as the navigation, footer, share links, calls to action and other things you can find on most web sites.&lt;/p&gt;

&lt;p&gt;To improve on this, you could install a module like &lt;a href="https://cheerio.js.org/" rel="noopener noreferrer"&gt;cheerio&lt;/a&gt; and select only the important parts:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install &lt;/span&gt;cheerio
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nx"&gt;cheerio&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;cheerio&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;html&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://philna.sh/blog/2024/09/25/how-to-create-vector-embeddings-in-node-js/&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;$&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;cheerio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;html&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;h1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;section#blog-content &amp;gt; div:first-child&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this code you get the title and text of the article. As I said earlier, this is great if you know the structure of the HTML, but that won't always be the case.&lt;/p&gt;

&lt;p&gt;Instead, install &lt;a href="https://github.com/mozilla/readability" rel="noopener noreferrer"&gt;Readability.js&lt;/a&gt; and &lt;a href="https://github.com/jsdom/jsdom" rel="noopener noreferrer"&gt;jsdom&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @mozilla/readability jsdom
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Readability.js normally runs in a browser environment and uses the live document rather than a string of HTML, so we need to include jsdom to provide that in Node.js. Now we can turn the HTML we already loaded into a document and pass it to Readability.js to parse out the content.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Readability&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@mozilla/readability&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;JSDOM&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;jsdom&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://philna.sh/blog/2024/09/25/how-to-create-vector-embeddings-in-node-js/&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;html&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;JSDOM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Readability&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;article&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;article&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you inspect the article, you can see that it has parsed a number of things from the HTML.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fllgkxfgr1wdykg47mlcs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fllgkxfgr1wdykg47mlcs.png" alt="The object that Readability.js produces. It has properties for the title, byline, dir, lang, content, textContent, length, excerpt, siteName, and publishedTime." width="800" height="431"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;There's the title, author, excerpt, publish time, and both the content and textContent. The textContent property is the plain text content of the article, ready for you to &lt;a href="https://philna.sh/blog/2024/09/18/how-to-chunk-text-in-javascript-for-rag-applications/" rel="noopener noreferrer"&gt;split into chunks&lt;/a&gt;, &lt;a href=""&gt;create vector embeddings&lt;/a&gt;, and ingest into a vector database. The content property is the original HTML, including links and images. This could be useful if you want to extract links or process the images somehow.&lt;/p&gt;

&lt;p&gt;You might also want to see whether the document is likely to return good results. Reader view works well on articles, but is less useful for other types of content. You can do a quick check to see if the HTML is suitable for processing with Readability.js with the function &lt;code&gt;isProbablyReaderable&lt;/code&gt;. If this function returns false you may want to parse the HTML in a different way, or even inspect the URL to see whether it has useful content for you.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;JSDOM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Readability&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;isProbablyReaderable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;article&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;article&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// do something else&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the page fails this check, you might want to flag the URL to see whether it does include useful information for your RAG application, or whether it should be excluded.&lt;/p&gt;

&lt;h2&gt;
  
  
  Using Readability with LangChain.js
&lt;/h2&gt;

&lt;p&gt;If you're using &lt;a href="https://js.langchain.com/" rel="noopener noreferrer"&gt;LangChain.js&lt;/a&gt; for your application, you can also use Readability.js to return the content from an HTML page. It fits nicely into your data ingestion pipelines, working with other LangChain components, like &lt;a href="https://philna.sh/blog/2024/09/18/how-to-chunk-text-in-javascript-for-rag-applications/" rel="noopener noreferrer"&gt;text chunkers&lt;/a&gt; and &lt;a href="https://docs.datastax.com/en/astra-db-serverless/integrations/langchain-js.html?utm_medium=byline&amp;amp;utm_campaign=html-content-retrieval-augmented-generation-readability-js&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;vector stores&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The following example uses LangChain.js to load the same page as above, return the relevant content from the page using the &lt;a href="https://js.langchain.com/docs/integrations/document_transformers/mozilla_readability/" rel="noopener noreferrer"&gt;&lt;code&gt;MozillaReadabilityTransformer&lt;/code&gt;&lt;/a&gt;, split the text into chunks using the &lt;a href="https://js.langchain.com/docs/how_to/recursive_text_splitter/" rel="noopener noreferrer"&gt;&lt;code&gt;RecursiveCharacterTextSplitter&lt;/code&gt;&lt;/a&gt;, create vector embeddings with OpenAI, and store the data in &lt;a href="https://docs.datastax.com/en/astra-db-serverless/integrations/langchain-js.html?utm_medium=byline&amp;amp;utm_campaign=html-content-retrieval-augmented-generation-readability-js&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Astra DB&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;You'll need to install the following dependencies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @langchain/core @langchain/community @langchain/openai @datastax/astra-db-ts @mozilla/readability jsdom
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To run the example, you will need to create an Astra DB database and store the database's endpoint and application token in your environment as &lt;code&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/code&gt; and &lt;code&gt;ASTRA_DB_API_ENDPOINT&lt;/code&gt;. You will also need an OpenAI API key stored in your environment as &lt;code&gt;OPENAI_API_KEY&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Import the dependencies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;HTMLWebBaseLoader&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@langchain/community/document_loaders/web/html&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;MozillaReadabilityTransformer&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@langchain/community/document_transformers/mozilla_readability&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;RecursiveCharacterTextSplitter&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@langchain/textsplitters&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;OpenAIEmbeddings&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@langchain/openai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;AstraDBVectorStore&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@langchain/community/vectorstores/astradb&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We use the &lt;code&gt;HTMLWebBaseLoader&lt;/code&gt; to load the raw HTML from the URL we provide. The HTML is then passed through the &lt;code&gt;MozillaReadabilityTransformer&lt;/code&gt; to extract the text, which is then split into chunks by the &lt;code&gt;RecursiveCharacterTextSplitter&lt;/code&gt;. Finally, we create an embedding provider and an Astra DB vector store that will be used to turn the text chunks into vector embeddings and store them in the vector database.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;loader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;HTMLWebBaseLoader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://philna.sh/blog/2024/09/25/how-to-create-vector-embeddings-in-node-js/&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;transformer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;MozillaReadabilityTransformer&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;splitter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;RecursiveCharacterTextSplitter&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;maxCharacterCount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;chunkOverlap&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenAIEmbeddings&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text-embedding-3-small&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;vectorStore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;AstraDBVectorStore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;token&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ASTRA_DB_API_ENDPOINT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;content&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;collectionOptions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;dimension&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1536&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;metric&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;cosine&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;vectorStore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;initialize&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The initialisation of all the components makes up most of the work. Once everything is set up, you can load, transform, split, embed and store the documents like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;loader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;sequence&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;transformer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pipe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;splitter&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;vectorizedDocs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;sequence&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;vectorStore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addDocuments&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;vectorizedDocs&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  More accurate data from web scraping with Readability.js
&lt;/h2&gt;

&lt;p&gt;Readability.js is a battle-tested library powering Firefox's reader mode that we can use to scrape only relevant data from web pages. This cleans up web content and makes it much more useful for RAG.&lt;/p&gt;

&lt;p&gt;As we've seen, you can do this directly with the library or using LangChain.js and the &lt;code&gt;MozillaReadabilityTransformer&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Getting data from a web page is only the first step in your ingestion pipeline. From here you'll need to split your text into chunks, create vector embeddings, and store everything in Astra DB. Then you'll be ready to build your RAG-powered application.&lt;/p&gt;

</description>
      <category>node</category>
      <category>rag</category>
      <category>javascript</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
