<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Tim Kelly</title>
    <description>The latest articles on Forem by Tim Kelly (@timkelly).</description>
    <link>https://forem.com/timkelly</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2841795%2Ff01f27f2-72c0-45cb-b0c2-d74b1a332545.jpeg</url>
      <title>Forem: Tim Kelly</title>
      <link>https://forem.com/timkelly</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/timkelly"/>
    <language>en</language>
    <item>
      <title>Build a Spring AI MCP Server With MongoDB</title>
      <dc:creator>Tim Kelly</dc:creator>
      <pubDate>Tue, 11 Nov 2025 13:54:18 +0000</pubDate>
      <link>https://forem.com/mongodb/build-a-spring-ai-mcp-server-with-mongodb-1ebd</link>
      <guid>https://forem.com/mongodb/build-a-spring-ai-mcp-server-with-mongodb-1ebd</guid>
      <description>&lt;p&gt;In this tutorial, we're going to build a Model Context Protocol (MCP) server that connects AI models to MongoDB, using it to power a simple todo list application. If you've been working with AI applications, you've probably run into the challenge of getting LLMs to interact with your actual systems and data. That's exactly what the Model Context Protocol solves! It's quickly becoming the standard way to connect AI models to the real world. Whether you're building agents that need to query databases, access APIs, or interact with your company's internal tools, understanding MCP is becoming essential. By the end of this tutorial, you'll have a working server that exposes MongoDB operations as tools any MCP-compatible AI can use, and you'll understand the patterns for building your own servers for whatever data sources you need to connect.&lt;/p&gt;

&lt;p&gt;MongoDB is a natural fit for AI applications—it's flexible, scalable, and has built-in vector search capabilities that make it easy to work with embeddings and semantic search. If you're curious about the AI side of MongoDB and want to dive deeper into vector search, there's a &lt;a href="https://learn.mongodb.com/courses/vector-search-fundamentals?utm_campaign=devrel&amp;amp;utm_source=third-part-content&amp;amp;utm_medium=cta&amp;amp;utm_content=spring+ai+mcp&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;Vector Search Fundamentals course&lt;/a&gt; that'll walk you through those concepts (and you can earn a skill badge while you're at it).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi7u7kydc2awkt6gwz3en.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi7u7kydc2awkt6gwz3en.png" alt="MongoDB Vector Fundamental badge" width="622" height="264"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you want to take a look at the code used in this tutorial, check out the &lt;a href="https://github.com/mongodb-developer/springai-mcp" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is the Model Context Protocol?
&lt;/h2&gt;

&lt;p&gt;The &lt;a href="https://modelcontextprotocol.io/docs/getting-started/intro" rel="noopener noreferrer"&gt;Model Context Protocol (MCP)&lt;/a&gt; is an open standard that enables AI applications to securely connect to various data sources and tools. Think of it as a universal adapter that lets your AI models interact with databases, APIs, file systems, and other services in a standardized way. Instead of building custom integrations for every data source, MCP provides a common language that both AI systems and data providers can speak.&lt;/p&gt;

&lt;p&gt;At its core, MCP defines how servers expose their capabilities (like database queries or file operations) as tools, resources, and prompts that AI models can discover and use. This makes it incredibly powerful for building context-aware AI applications that need to pull from multiple sources.&lt;/p&gt;

&lt;h2&gt;
  
  
  The difference between an MCP server and an MCP client
&lt;/h2&gt;

&lt;p&gt;An &lt;strong&gt;MCP server&lt;/strong&gt; is the component that exposes capabilities to AI systems. It's what we'll be building in this tutorial. The server defines tools (functions the AI can call), resources (data the AI can access), and prompts (templated conversations). When we build our Spring AI MCP server, we're creating something that says, "Hey, I can interact with MongoDB for you—here are the operations I support."&lt;/p&gt;

&lt;p&gt;An &lt;strong&gt;MCP client&lt;/strong&gt;, on the other hand, is the component that consumes these capabilities. This is typically your AI application or agent that discovers available tools from MCP servers and decides when to use them. The client sends requests to servers and handles the responses. In the broader ecosystem, applications like Claude Desktop or other AI interfaces act as MCP clients that can connect to your servers.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Spring AI?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://spring.io/projects/spring-ai" rel="noopener noreferrer"&gt;Spring AI&lt;/a&gt; is an application framework for AI engineering. Its goal is to apply to the AI domain Spring ecosystem design principles such as portability and modular design and promote using POJOs as the building blocks of an application to the AI domain.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Java 17+
&lt;/li&gt;
&lt;li&gt;Maven
&lt;/li&gt;
&lt;li&gt;A &lt;a href="https://account.mongodb.com/account/register?utm_campaign=devrel&amp;amp;utm_source=third-part-content&amp;amp;utm_medium=cta&amp;amp;utm_content=spring+ai+mcp&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;MongoDB Atlas account&lt;/a&gt; with a cluster set up

&lt;ul&gt;
&lt;li&gt;An M0 free forever tier is perfect
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Node (version 22.7.5 or higher is recommended)

&lt;ul&gt;
&lt;li&gt;We need this as we will be using the &lt;a href="https://modelcontextprotocol.io/docs/tools/inspector" rel="noopener noreferrer"&gt;MCP Inspector&lt;/a&gt; to test our application&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Build a Spring app and our dependencies
&lt;/h2&gt;

&lt;p&gt;To start building our application, we will need to bring in a few dependencies. The easiest way to get started is using &lt;a href="https://start.spring.io/" rel="noopener noreferrer"&gt;Spring Initializr&lt;/a&gt;. We are going to need the &lt;strong&gt;Model Context Protocol Server&lt;/strong&gt; and the &lt;strong&gt;Spring Web&lt;/strong&gt; dependencies. We'll be adding the &lt;a href="https://spring.io/projects/spring-data-mongodb" rel="noopener noreferrer"&gt;&lt;strong&gt;Spring Data MongoDB&lt;/strong&gt;&lt;/a&gt; dependency later on, but we'll keep it to just the first two for now to make getting started easier.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5mdh446khizg9qa037er.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5mdh446khizg9qa037er.png" alt="Spring iniatilzr configuration" width="800" height="472"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For Project, I have selected Maven; for language, Java; and Spring Boot version 3.5.7 (or the latest stable release). Now, we can generate our Jar and open the downloaded app in our IDE.&lt;/p&gt;

&lt;h3&gt;
  
  
  Configure our MCP server
&lt;/h3&gt;

&lt;p&gt;First things first, we need to update our Spring AI version to the latest milestone &lt;code&gt;1.1.0-M3&lt;/code&gt;. It is possible to do all this with the latest stable release, but the newer interface provides a much cleaner setup.&lt;/p&gt;

&lt;p&gt;Open the &lt;code&gt;POM.XML&lt;/code&gt; and update the Spring AI version:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;spring-ai.version&amp;gt;&lt;/span&gt;1.1.0-M3&lt;span class="nt"&gt;&amp;lt;/spring-ai.version&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, we can define our MCP details in the &lt;code&gt;application.properties&lt;/code&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;spring.application.name=springai-mcp
spring.ai.mcp.server.name=mongodb-mcp
spring.ai.mcp.server.version=1.0.0

spring.ai.mcp.server.protocol=streamable
spring.ai.mcp.server.stdio=false
spring.ai.mcp.server.type=sync
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We give our server a name and version—this helps clients identify what they're connecting to. The name &lt;code&gt;mongodb-mcp&lt;/code&gt; tells clients this server provides MongoDB capabilities, and the version lets us track updates over time.&lt;/p&gt;

&lt;p&gt;We want an HTTP streamable service, no stdio, and synchronous responses. This means our server will communicate over HTTP rather than standard input/output, and it'll handle requests synchronously rather than asynchronously. This keeps things simple for our tutorial, though async operations are definitely an option for production use cases.&lt;/p&gt;

&lt;h2&gt;
  
  
  Build your first MCP tool
&lt;/h2&gt;

&lt;p&gt;An MCP tool is essentially a function that an AI model can call. When you annotate a method with &lt;code&gt;@McpTool&lt;/code&gt;, you're telling the MCP framework, "This is something an AI can use." The AI will be able to see the tool's name, description, and parameters, then decide whether to call it based on what the user is asking for.&lt;/p&gt;

&lt;p&gt;Create a class &lt;code&gt;MongoDbTools.java&lt;/code&gt;, where we will create a &lt;code&gt;@Component&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;We will use the &lt;code&gt;@McpTool&lt;/code&gt; annotation. The &lt;code&gt;name&lt;/code&gt; we give it will be how the AI identifies and calls this tool—think of it as the function name the AI will invoke. The &lt;code&gt;description&lt;/code&gt; is important as it tells the AI what this tool does and when to use it. A good description helps the AI make smart decisions about when to call your tool. Think of this like a prompt template. When we give our application a natural language prompt, these descriptions allow the LLM to discern which tool is best for the job, and orchestrate the tools to achieve what we are asking it to do.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.timkelly.springaimcp&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springaicommunity.mcp.annotation.McpTool&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Component&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="nd"&gt;@Component&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MongoDbTools&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="c1"&gt;// Tools&lt;/span&gt;
    &lt;span class="nd"&gt;@McpTool&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"get-todo-items"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"I will return the items of a todo list"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getTodoItems&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;todo&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""
                1. Make Spring AI MCP tutorial

                2. Have a coffee

                3. Walk the dog
                """&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;todo&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is where we would define &lt;code&gt;@McpResources&lt;/code&gt; or &lt;code&gt;@McpPrompts&lt;/code&gt;. The &lt;code&gt;@McpResource&lt;/code&gt; annotation provides access to resources via URI templates. The &lt;code&gt;@McpPrompt&lt;/code&gt; annotation generates prompt messages for AI interactions. These go beyond what we cover in this tutorial, but you can read about them in the &lt;a href="https://docs.spring.io/spring-ai/reference/1.1/api/mcp/mcp-annotations-server.html" rel="noopener noreferrer"&gt;Spring AI MCP documentation&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  How we will test our MCP server
&lt;/h2&gt;

&lt;p&gt;First, we need to run our application.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mvn spring-boot:run
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To test our application, we will use the &lt;a href="https://github.com/modelcontextprotocol/inspector" rel="noopener noreferrer"&gt;MCP Inspector&lt;/a&gt;. This is an interactive developer tool for testing and debugging MCP servers.&lt;/p&gt;

&lt;p&gt;If you want to read more about it, check out the &lt;a href="https://modelcontextprotocol.io/legacy/tools/debugging" rel="noopener noreferrer"&gt;Debugging guide&lt;/a&gt;, which covers the Inspector as part of the overall debugging toolkit. For this tutorial, all we are going to do is use it to make sure our application is up and running, exposing the tools we create, and testing those tools.&lt;/p&gt;

&lt;p&gt;The Inspector runs with the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @modelcontextprotocol/inspector
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If this is your first time running it, you will likely need to install it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Need to &lt;span class="nb"&gt;install &lt;/span&gt;the following packages:
@modelcontextprotocol/inspector@0.17.2
Ok to proceed? &lt;span class="o"&gt;(&lt;/span&gt;y&lt;span class="o"&gt;)&lt;/span&gt; y
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once this spins up, it will open in the browser. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkudyajl3a5727pdyy17b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkudyajl3a5727pdyy17b.png" alt="MCP Inspector default screen" width="800" height="539"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now, for the transport type in the top left, we need to change this to &lt;code&gt;Streamable HTTP&lt;/code&gt;. This will change the interface slightly, giving us the option to define the URL our app is running on. Mine is localhost:8080, so I will enter &lt;code&gt;http://localhost:8080/mcp&lt;/code&gt;. The &lt;code&gt;/mcp&lt;/code&gt; is important to add—this is the default endpoint where Spring AI MCP exposes the protocol.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F891o32ujvfailqqoceow.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F891o32ujvfailqqoceow.png" alt="MCP Inspector configuration" width="764" height="1562"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At this point, we can just hit connect. We will receive some server logging in the history section. It will confirm the name of the server, update when tools change, all that good stuff.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbixz9ef09hc36a2ph89b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbixz9ef09hc36a2ph89b.png" alt="Inspector connected to MCP" width="800" height="481"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;From the top bar we can select tools, and here we can list all the tools we have available. If we select a tool, we can then choose to run it in the browser. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsriqg3gcxks0io54ox0q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsriqg3gcxks0io54ox0q.png" alt="Tools section" width="800" height="326"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If everything is working correctly, we should be able to confirm our output. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fle81edmcg6jua6nekegt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fle81edmcg6jua6nekegt.png" alt="Demo tool output" width="800" height="738"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Success! Now that we have this confirmed, we can look to add some functionality to our todo list by connecting to MongoDB as our database.&lt;/p&gt;

&lt;h2&gt;
  
  
  Connect to MongoDB
&lt;/h2&gt;

&lt;p&gt;Add the Spring Data MongoDB dependency to your &lt;code&gt;pom.xml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;  
    &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.springframework.boot&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;  
    &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;spring-boot-starter-data-mongodb&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;  
&lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add connection details to the &lt;code&gt;application.properties&lt;/code&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#...

# MongoDB Configuration
# 1. Copy your connection string from MongoDB Atlas
# 2. Replace ?appName=Cluster with ?appName=SpringAiMcp, or add it if no appName is present
#
# Before: mongodb+srv://user:pass@cluster.abc12.mongodb.net/?appName=Cluster
# After:  mongodb+srv://user:pass@cluster.abc12.mongodb.net/?appName=SpringAiMcp

spring.data.mongodb.uri=&amp;lt;YOUR-CONNECTION-STRING&amp;gt;
spring.data.mongodb.database=todo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For our connection string, it is best practice to set the app name so we can see clearly what queries or operations are tied to which apps. For this, set the &lt;code&gt;appName&lt;/code&gt; to &lt;code&gt;SpringAiMcp&lt;/code&gt; by adding it to the end of your connection string. The database name &lt;code&gt;todo&lt;/code&gt; is where we'll store all our task data.&lt;/p&gt;

&lt;h3&gt;
  
  
  Quick connection check (optional)
&lt;/h3&gt;

&lt;p&gt;Run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mvn spring-boot:run
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Look for a line similar to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Monitor thread successfully connected to server with description ServerDescription{address=...}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you see authentication or timeout errors, check:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The connection string credentials.
&lt;/li&gt;
&lt;li&gt;Your IP access list in MongoDB Atlas.
&lt;/li&gt;
&lt;li&gt;That the appName is added correctly at the end of the URI.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Build our MongoDB tool
&lt;/h2&gt;

&lt;p&gt;Now that we have MongoDB connected, it's time to replace our hardcoded todo list with real database operations. We're going to build out a complete set of tools that let an AI manage tasks in MongoDB—adding new tasks, marking them as complete, and retrieving tasks with various filters.&lt;/p&gt;

&lt;p&gt;The beauty of the MCP approach here is that we're not writing AI-specific code. We're just building normal Spring Data repositories and services, then exposing them as tools with a few annotations. The AI doesn't need to know anything about MongoDB connection strings, query syntax, or data modeling—it just needs to know, "Here's a tool called &lt;code&gt;todo-add-task&lt;/code&gt; that takes a task name." Our MCP server handles all the translation between the AI's requests and the actual database operations.&lt;/p&gt;

&lt;p&gt;We'll structure this in layers, following Spring best practices. First, we'll define our data model and repository for database access. Then, we'll create a service layer to handle our business logic. Finally, we'll expose these operations as MCP tools that AI models can discover and use. This keeps our code clean, testable, and easy to extend with new functionality later.&lt;/p&gt;

&lt;p&gt;Let's start by setting up our data model.&lt;/p&gt;

&lt;h3&gt;
  
  
  Define our Task model and repository
&lt;/h3&gt;

&lt;p&gt;First, let's create our &lt;code&gt;Task&lt;/code&gt; model in &lt;code&gt;Task.java&lt;/code&gt;. This represents a single todo item in our MongoDB collection:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.timkelly.springaimcp&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.bson.types.ObjectId&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.annotation.Id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.mongodb.core.mapping.Document&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"tasks"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Task&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="nd"&gt;@Id&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="n"&gt;completed&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;Task&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ObjectId&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt; &lt;span class="nf"&gt;getId&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getName&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setName&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="nf"&gt;isCompleted&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;completed&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setCompleted&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="n"&gt;completed&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;completed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;completed&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, create our repository interface in &lt;code&gt;TodoRepository.java&lt;/code&gt;. Spring Data MongoDB will handle the implementation for us—this is one of the best features of Spring Data. We just define an interface that extends &lt;code&gt;MongoRepository&lt;/code&gt;, and Spring will automatically generate all the common CRUD operations (create, read, update, delete) at runtime. No need to write boilerplate code for basic database operations like &lt;code&gt;findAll()&lt;/code&gt;, &lt;code&gt;save()&lt;/code&gt;, or &lt;code&gt;deleteById()&lt;/code&gt;—they're all provided out of the box.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.timkelly.springaimcp&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.bson.types.ObjectId&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.mongodb.repository.MongoRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Repository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.List&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@Repository&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;interface&lt;/span&gt; &lt;span class="nc"&gt;TodoRepository&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;MongoRepository&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The generic types &lt;code&gt;&amp;lt;Task, ObjectId&amp;gt;&lt;/code&gt; tell Spring what entity we're working with and what type the ID field is. Spring Data MongoDB is smart enough to map our Java objects to BSON documents in MongoDB and back again, handling all the serialization for us.&lt;/p&gt;

&lt;h2&gt;
  
  
  Creating our add task tool
&lt;/h2&gt;

&lt;p&gt;Let's create a service layer to handle our business logic. Create &lt;code&gt;TodoService.java&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.timkelly.springaimcp&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.bson.types.ObjectId&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.List&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@Service&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TodoService&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;TodoRepository&lt;/span&gt; &lt;span class="n"&gt;todoRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;TodoService&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;TodoRepository&lt;/span&gt; &lt;span class="n"&gt;todoRepository&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;todoRepository&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;todoRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;addTask&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="n"&gt;todoRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;save&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, update our &lt;code&gt;MongoDbTools.java&lt;/code&gt; class to wire in the service and add our first real tool:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.timkelly.springaimcp&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springaicommunity.mcp.annotation.McpTool&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springaicommunity.mcp.annotation.McpToolParam&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.boot.autoconfigure.condition.ConditionalOnBean&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Component&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.List&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@Component&lt;/span&gt;  
&lt;span class="nd"&gt;@ConditionalOnBean&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;TodoService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MongoDbTools&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;TodoService&lt;/span&gt; &lt;span class="n"&gt;todoService&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;MongoDbTools&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;TodoService&lt;/span&gt; &lt;span class="n"&gt;todoService&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;todoService&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;todoService&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="nd"&gt;@McpTool&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
            &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"todo-add-task"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
            &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Add a new to-do task to MongoDB"&lt;/span&gt;  
    &lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;addTask&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
            &lt;span class="nd"&gt;@McpToolParam&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
                    &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"The name or description of the new task"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
                    &lt;span class="n"&gt;required&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;  
            &lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;  
    &lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="n"&gt;todoService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;addTask&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"Task added successfully: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;@McpToolParam&lt;/code&gt; annotation allows us to provide metadata about each parameter. The description helps the AI understand what to pass in, and marking it as required ensures the AI knows this parameter can't be optional. The better we describe our parameters, the better the AI can use our tools correctly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Update our tasks
&lt;/h2&gt;

&lt;p&gt;Let's add a query method to our &lt;code&gt;TodoRepository.java&lt;/code&gt; so we can find tasks by name:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.timkelly.springaimcp&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.bson.types.ObjectId&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.mongodb.repository.MongoRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.mongodb.repository.Update&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; 
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Repository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="nd"&gt;@Repository&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;interface&lt;/span&gt; &lt;span class="nc"&gt;TodoRepository&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;MongoRepository&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="nd"&gt;@Update&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"{ '$set' : { 'completed' : ?0 } }"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;updateCompletedByName&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="n"&gt;completed&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is another powerful Spring Data feature—query derivation from method names. Spring Data MongoDB will parse the method name &lt;code&gt;findByName&lt;/code&gt; and automatically generate the MongoDB query for us. It sees "findBy" and knows we want to query, then "Name" tells it which field to match against. At runtime, this becomes a query like &lt;code&gt;db.tasks.find({ name: "some name" })&lt;/code&gt; without us writing any query code. As long as we follow Spring Data's naming conventions, we can create complex queries just by naming our methods correctly—things like &lt;code&gt;findByCompletedTrueAndNameContaining&lt;/code&gt; would work exactly as you'd expect.&lt;/p&gt;

&lt;p&gt;Now, add the &lt;code&gt;completeTask&lt;/code&gt; method to &lt;code&gt;TodoService.java&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;   &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setCompletedByName&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="n"&gt;completed&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;todoRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;updateCompletedByName&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;completed&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And add the corresponding tool to &lt;code&gt;MongoDbTools.java&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;   &lt;span class="nd"&gt;@McpTool&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"todo-complete-task"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Mark a to-do task as complete by name"&lt;/span&gt;
    &lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;completeTask&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
            &lt;span class="nd"&gt;@McpToolParam&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"The name of the task to mark as complete or incomplete"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;required&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
            &lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nd"&gt;@McpToolParam&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"The status of the task, either complete(true) or incomplete(false)"&lt;/span&gt;
            &lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;
    &lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;todoService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setCompletedByName&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"Marked task as "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;": "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Creating our get tasks tools
&lt;/h2&gt;

&lt;p&gt;Let's add more query methods to our &lt;code&gt;TodoRepository.java&lt;/code&gt; to support filtering by completion status:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.timkelly.springaimcp&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.bson.types.ObjectId&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.mongodb.repository.MongoRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.mongodb.repository.Update&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Repository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.List&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@Repository&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;interface&lt;/span&gt; &lt;span class="nc"&gt;TodoRepository&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;MongoRepository&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="nd"&gt;@Update&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"{ '$set' : { 'completed' : ?1 } }"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;updateCompletedByName&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="n"&gt;completed&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;findByCompletedTrue&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;findByCompletedFalse&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add the &lt;code&gt;getTasks&lt;/code&gt; method to &lt;code&gt;TodoService.java&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getTasks&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;filter&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isBlank&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;equalsIgnoreCase&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"all"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;todoRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findAll&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;switch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toLowerCase&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="s"&gt;"incomplete"&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;todoRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findByCompletedFalse&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="s"&gt;"complete"&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;todoRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findByCompletedTrue&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
        &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;todoRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findAll&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
    &lt;span class="o"&gt;};&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And finally, add the tool to &lt;code&gt;MongoDbTools.java&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@McpTool&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
        &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"todo-get-tasks"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
        &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Retrieve to-do list items, optionally filtered by completion status"&lt;/span&gt;  
&lt;span class="o"&gt;)&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getTasks&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
        &lt;span class="nd"&gt;@McpToolParam&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
                &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Filter tasks by completion status: 'complete', 'incomplete', or 'all' (default: 'all')"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
                &lt;span class="n"&gt;required&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;  
        &lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;filter&lt;/span&gt;  
&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;todoService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getTasks&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Testing our application
&lt;/h2&gt;

&lt;p&gt;Make sure to set your MongoDB connection string before running the application:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mvn spring-boot:run
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, if we go back to our MCP Inspector, we can see our new tools listed in the tools section.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F65cezmfi3qt10xvkoxl7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F65cezmfi3qt10xvkoxl7.png" alt="Updated MCP tools displayed in Inspector" width="800" height="499"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can test each one individually—try adding a task, marking it complete, and retrieving tasks with different filters.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn3ncstgt8hc4akhrpqgf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn3ncstgt8hc4akhrpqgf.png" alt="Add tasks tool displayed in inspector" width="800" height="579"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The Inspector will show you the requests being sent and the responses coming back, making it easy to verify everything is working as expected. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs3rmt1sr4t1zbjjxpk7w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs3rmt1sr4t1zbjjxpk7w.png" alt="Inspector get tasks displaying results" width="800" height="872"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And there you have it! You've built a fully functional MCP server that exposes MongoDB operations as tools that AI models can use. This same pattern can be extended to build servers for all kinds of data sources and operations—the possibilities are pretty much endless.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Congrats! You've just built a fully functional MCP server that bridges the gap between AI models and MongoDB. What started as a simple, hardcoded todo list evolved into a real application with persistent storage, CRUD operations, and a clean architecture that any AI client can interact with.&lt;/p&gt;

&lt;p&gt;The power of what you've built goes way beyond just a todo list. You now understand the fundamental pattern for exposing any data source or service to AI models through MCP. The &lt;code&gt;@McpTool&lt;/code&gt;, &lt;code&gt;@McpToolParam&lt;/code&gt;, and the other MCP annotations give you a consistent way to make anything accessible to AI.&lt;/p&gt;

&lt;p&gt;What's really cool is how little AI-specific code we actually wrote. Most of what we built was just normal Spring Boot application code—models, repositories, services. The MCP layer was just a thin wrapper on top that made everything discoverable and callable by AI models. This means you can take existing Spring applications and MCP-enable them without major rewrites.&lt;/p&gt;

&lt;p&gt;From here, you could extend this in tons of directions. Add authentication so different users have their own task lists. Create more complex queries with date filters or priority sorting. Build out &lt;code&gt;@McpResources&lt;/code&gt; to expose collection schemas or statistics. Add &lt;code&gt;@McpPrompts&lt;/code&gt; to help guide AI models on how to best use your tools. The Spring AI MCP framework gives you all the building blocks you need.&lt;/p&gt;

&lt;p&gt;The Model Context Protocol is still relatively new, but it's quickly becoming the standard way to connect AI models to real-world data and systems. Getting comfortable with building MCP servers now puts you ahead of the curve as this ecosystem continues to grow. Whether you're building internal tools for your team, creating services for AI agents, or just experimenting with what's possible when AI can access real data, MCP is the bridge that makes it all work.&lt;/p&gt;

&lt;p&gt;If you found this tutorial useful, check out my other tutorial, &lt;a href="https://dev.to/mongodb/secure-local-rag-with-role-based-access-spring-ai-ollama-mongodb-1gj"&gt;Secure Local RAG With Role-Based Access: Spring AI, Ollama, &amp;amp; MongoDB&lt;/a&gt;.  &lt;/p&gt;

</description>
      <category>mongodb</category>
      <category>spring</category>
      <category>mcp</category>
      <category>ai</category>
    </item>
    <item>
      <title>Build Your First AI Agent With MongoDB and LangChain4j</title>
      <dc:creator>Tim Kelly</dc:creator>
      <pubDate>Thu, 16 Oct 2025 15:54:23 +0000</pubDate>
      <link>https://forem.com/mongodb/build-your-first-ai-agent-with-mongodb-and-langchain4j-4i5l</link>
      <guid>https://forem.com/mongodb/build-your-first-ai-agent-with-mongodb-and-langchain4j-4i5l</guid>
      <description>&lt;p&gt;AI agents are everywhere right now. We've all heard the pitch: They reason about problems, use tools autonomously, and chain together multiple steps to accomplish goals without constant hand-holding. They're being deployed to book flights, analyze data pipelines, and handle customer support—taking on tasks that previously required either rigid automation or human intervention.&lt;/p&gt;

&lt;p&gt;This tutorial sits as a nice introduction to building your first AI agent with LangChain4j. We'll create a movie recommendation agent that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Understands natural language descriptions of plots ("a sci-fi movie about rebels fighting an empire").
&lt;/li&gt;
&lt;li&gt;Searches semantically through a database using vector embeddings.
&lt;/li&gt;
&lt;li&gt;Calls external APIs to fetch real-time streaming availability.
&lt;/li&gt;
&lt;li&gt;Orchestrates these steps autonomously based on the query.
&lt;/li&gt;
&lt;li&gt;Returns clean, conversational answers instead of raw JSON.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This agent uses MongoDB Atlas for vector search, LangChain4j's agentic framework for orchestration, and the &lt;a href="https://api.watchmode.com/" rel="noopener noreferrer"&gt;Watchmode API&lt;/a&gt; for streaming data. By the end, you'll understand not just how to wire up these components, but how agentic systems actually work: how LLMs plan multi-step workflows, how tools share state, and how to give your agent instructions without hardcoding every possible scenario.&lt;/p&gt;

&lt;p&gt;Vector search is what will allow us to search our data with natural language, and if you'd like to learn more and earn a skills badge, check out our &lt;a href="https://learn.mongodb.com/courses/vector-search-fundamentals?team=devrel-content&amp;amp;utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_content=langchain4j%2Bagent&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;Vector Search Fundamentals&lt;/a&gt; skills badge.&lt;/p&gt;

&lt;p&gt;If you just want the code, it's available in this &lt;a href="https://github.com/mongodb-developer/movie-agent" rel="noopener noreferrer"&gt;GitHub repo&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is an AI agent?
&lt;/h2&gt;

&lt;p&gt;An AI agent is a system that can take in information about its environment, decide what to do next, and work towards a goal, with minimum human intervention. Instead of behaving like a static chatbot that only answers questions, an agent can plan, use tools (like databases or APIs), and adapt based on results.&lt;/p&gt;

&lt;p&gt;While the defining traits of an AI agent are the ability to &lt;em&gt;reason&lt;/em&gt; and &lt;em&gt;act&lt;/em&gt;, there’s no single paradigm for how an agent must function. One popular approach is the &lt;a href="https://arxiv.org/pdf/2210.03629" rel="noopener noreferrer"&gt;ReAct framework (Reasoning + Acting)&lt;/a&gt;, where the agent:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Thinks about the problem.
&lt;/li&gt;
&lt;li&gt;Takes an action (e.g., querying a search engine).
&lt;/li&gt;
&lt;li&gt;Observes the result.
&lt;/li&gt;
&lt;li&gt;Repeats until it can deliver a complete answer.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In short:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Chatbot&lt;/strong&gt;: Answers questions based on what it already knows
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI agent&lt;/strong&gt;: Reasons about what it needs to learn, uses tools to gather that information, and works toward a goal&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For our movie recommendation agent, we'll use a supervisor pattern, where a planning model orchestrates multiple specialized tools to find movies by plot description, look up streaming availability, and return a clean answer to the user.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj375nyufnavnbe5iyq4q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj375nyufnavnbe5iyq4q.png" alt="Application diagram for how our AI agent will work" width="800" height="444"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What is LangChain4j?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/langchain4j/langchain4j" rel="noopener noreferrer"&gt;LangChain4j&lt;/a&gt; is an open-source Java library designed to simplify building LLM-powered applications. It provides a unified interface for working with multiple LLM providers (OpenAI, Anthropic, etc.) and vector stores like &lt;a href="https://docs.langchain4j.dev/integrations/embedding-stores/mongodb-atlas" rel="noopener noreferrer"&gt;MongoDB Atlas&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;While the name invites comparison to the Python-based &lt;a href="https://www.langchain.com/" rel="noopener noreferrer"&gt;LangChain&lt;/a&gt; project, LangChain4j is more of a fusion, drawing inspiration from Haystack, LlamaIndex, and the wider AI community, all while staying laser-focused on the needs of Java developers.&lt;/p&gt;

&lt;p&gt;Development kicked off in early 2023 during the ChatGPT boom, and while the project is still evolving, its core functionality is stable and production-ready. If you're exploring LLM-powered applications in Java, LangChain4j is a pragmatic and actively maintained option.&lt;/p&gt;

&lt;p&gt;For this tutorial, we're specifically using the &lt;a href="https://docs.langchain4j.dev/tutorials/agentic" rel="noopener noreferrer"&gt;&lt;code&gt;langchain4j-agentic&lt;/code&gt;&lt;/a&gt; module, which provides abstractions for building agentic systems, including the supervisor pattern that will orchestrate our movie search workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;Before we get started, here's what you'll need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Java 8 or later&lt;/strong&gt; installed and ready to go (I'm using Java 24)
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Maven&lt;/strong&gt; for building the project (version 3.9.10 or later recommended)
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A MongoDB Atlas cluster&lt;/strong&gt;—a free M0 tier is perfect for this tutorial. If you need help setting one up, check out the &lt;a href="https://www.mongodb.com/docs/atlas/getting-started/?utm_campaign=devrel&amp;amp;utm_source=third-part-content&amp;amp;utm_medium=cta&amp;amp;utm_content=langchain4j+agent&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;Get Started with Atlas&lt;/a&gt; guide.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API keys&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Voyage AI&lt;/strong&gt; for generating embeddings (&lt;a href="https://www.voyageai.com/" rel="noopener noreferrer"&gt;sign up here&lt;/a&gt;)
&lt;/li&gt;
&lt;li&gt;You can use the OpenAI API for both the embeddings and chat model, if you prefer.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenAI&lt;/strong&gt; for the planning model (&lt;a href="https://platform.openai.com/api-keys" rel="noopener noreferrer"&gt;get your key here&lt;/a&gt;)
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Watchmode&lt;/strong&gt; for streaming availability data (&lt;a href="https://api.watchmode.com/" rel="noopener noreferrer"&gt;register here&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;Make sure your IP address is whitelisted in your MongoDB Atlas cluster's network access settings, and create a database user with read/write permissions.&lt;/p&gt;

&lt;h3&gt;
  
  
  Our dataset
&lt;/h3&gt;

&lt;p&gt;For this tutorial, we're using the &lt;a href="https://www.kaggle.com/datasets/harshitshankhdhar/imdb-dataset-of-top-1000-movies-and-tv-shows" rel="noopener noreferrer"&gt;IMDB Top 1000 Movies dataset from Kaggle&lt;/a&gt;. Download the CSV file and place it in your project's &lt;code&gt;src/main/resources&lt;/code&gt; directory, naming it &lt;code&gt;imdb_top_1000.csv&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This dataset includes movie titles, plot overviews, genres, directors, and IMDB ratings. We'll be embedding the &lt;code&gt;overview&lt;/code&gt; field (the plot description) so users can search for movies semantically—e.g., "Find me a sci-fi movie about rebels fighting an empire"—instead of needing to remember the exact title.&lt;/p&gt;

&lt;h2&gt;
  
  
  Creating our app
&lt;/h2&gt;

&lt;p&gt;Let's scaffold a new Maven project. Create a &lt;code&gt;pom.xml&lt;/code&gt; file with the following dependencies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;
    &lt;span class="nt"&gt;&amp;lt;dependencies&amp;gt;&lt;/span&gt;
        &lt;span class="c"&gt;&amp;lt;!-- LangChain4j core --&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;dev.langchain4j&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;langchain4j&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.5.0&lt;span class="nt"&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;

        &lt;span class="c"&gt;&amp;lt;!-- LangChain4j agentic module --&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;dev.langchain4j&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;langchain4j-agentic&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.5.0-beta11&lt;span class="nt"&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;

        &lt;span class="c"&gt;&amp;lt;!-- OpenAI integration for the planning model --&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;dev.langchain4j&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;langchain4j-open-ai&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.4.0&lt;span class="nt"&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;

        &lt;span class="c"&gt;&amp;lt;!-- MongoDB Atlas integration --&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;dev.langchain4j&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;langchain4j-mongodb-atlas&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.5.0-beta11&lt;span class="nt"&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;

        &lt;span class="c"&gt;&amp;lt;!-- Voyage AI for embeddings --&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;dev.langchain4j&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;langchain4j-voyage-ai&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.5.0-beta11&lt;span class="nt"&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;

        &lt;span class="c"&gt;&amp;lt;!-- MongoDB Java Driver --&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.mongodb&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;mongodb-driver-sync&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;version&amp;gt;&lt;/span&gt;5.5.1&lt;span class="nt"&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;

        &lt;span class="c"&gt;&amp;lt;!-- OpenCSV for parsing the IMDB dataset --&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;com.opencsv&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;opencsv&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;version&amp;gt;&lt;/span&gt;5.8&lt;span class="nt"&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;

        &lt;span class="c"&gt;&amp;lt;!-- Jackson for parsing JSON responses from Watchmode --&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;com.fasterxml.jackson.core&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;jackson-databind&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;version&amp;gt;&lt;/span&gt;2.17.2&lt;span class="nt"&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;/dependencies&amp;gt;&lt;/span&gt;

    &lt;span class="nt"&gt;&amp;lt;dependencyManagement&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;dependencies&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
                &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;dev.langchain4j&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
                &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;langchain4j-bom&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
                &lt;span class="nt"&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.5.0-beta11&lt;span class="nt"&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
                &lt;span class="nt"&gt;&amp;lt;type&amp;gt;&lt;/span&gt;pom&lt;span class="nt"&gt;&amp;lt;/type&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;/dependencies&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;/dependencyManagement&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/project&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These dependencies give us everything we need: the LangChain4j agentic framework, MongoDB and OpenAI integrations, embedding support via Voyage AI, and utilities for parsing CSV and JSON data.&lt;/p&gt;

&lt;h2&gt;
  
  
  Connecting to MongoDB
&lt;/h2&gt;

&lt;p&gt;Now, let's create the main application class that will handle connecting to MongoDB, loading our movie data, and setting up our agent.&lt;/p&gt;

&lt;p&gt;Create a file at &lt;code&gt;MovieAgentApp.java&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.movieagent&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.client.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.model.voyageai.VoyageAiEmbeddingModel&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.store.embedding.mongodb.IndexMapping&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.store.embedding.mongodb.MongoDbEmbeddingStore&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.bson.Document&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.HashSet&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MovieAgentApp&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;databaseName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"movie_search"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;collectionName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"movies"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;indexName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"vector_index"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;InterruptedException&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;embeddingApiKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getenv&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"VOYAGE_AI_KEY"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;mongodbUri&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getenv&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"MONGODB_URI"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;watchmodeKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getenv&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"WATCHMODE_KEY"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;openAiKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getenv&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"OPENAI_KEY"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="nc"&gt;MongoClient&lt;/span&gt; &lt;span class="n"&gt;mongoClient&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MongoClients&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;create&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mongodbUri&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="nc"&gt;VoyageAiEmbeddingModel&lt;/span&gt; &lt;span class="n"&gt;embeddingModel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;VoyageAiEmbeddingModel&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embeddingApiKey&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;modelName&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"voyage-3"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This sets up our connection to MongoDB using the connection string from our environment variables. We're also instantiating the Voyage AI embedding model, which we'll use to convert movie plot descriptions into vector embeddings.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;voyage-3&lt;/code&gt; model generates 1024-dimensional embeddings, which we'll need to specify when creating our vector search index. To learn more about Voyage AI's models, check out their &lt;a href="https://blog.voyageai.com/2024/09/18/voyage-3/" rel="noopener noreferrer"&gt;blog post about voyage-3&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Now, we will be using the Voyage AI model for generating our embeddings, and the separate OpenAI model for planning and orchestrating our AI agent. If you prefer, you can use OpenAI for both the embedding and planning. Just swap out your &lt;code&gt;VoyageAiEmbeddingModel&lt;/code&gt; for the &lt;a href="https://docs.langchain4j.dev/integrations/embedding-models/open-ai" rel="noopener noreferrer"&gt;&lt;code&gt;OpenAiEmbeddingModel&lt;/code&gt;&lt;/a&gt;. This does not go both ways, as Voyage AI is not supported as a chat model in LangChain4j.&lt;/p&gt;

&lt;p&gt;Next, we'll configure our embedding store and automatically create a vector search index if one doesn't already exist:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;        &lt;span class="nc"&gt;IndexMapping&lt;/span&gt; &lt;span class="n"&gt;indexMapping&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;IndexMapping&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;dimension&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embeddingModel&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;dimension&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;metadataFieldNames&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;HashSet&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;())&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

        &lt;span class="nc"&gt;MongoDbEmbeddingStore&lt;/span&gt; &lt;span class="n"&gt;embeddingStore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MongoDbEmbeddingStore&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;databaseName&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;databaseName&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;collectionName&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collectionName&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;createIndex&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;checkIndexExists&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mongoClient&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;indexName&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;indexName&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;indexMapping&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;indexMapping&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fromClient&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mongoClient&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;checkDataExists&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mongoClient&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;loadDataFromCSV&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embeddingStore&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embeddingModel&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's break down what's happening here:&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;IndexMapping&lt;/code&gt; tells LangChain4j how to configure the vector search index. We're setting the dimension to match our embedding model (1024 for &lt;code&gt;voyage-3&lt;/code&gt;) and leaving &lt;code&gt;metadataFieldNames&lt;/code&gt; empty since we don't need to filter on metadata fields for this example.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;MongoDbEmbeddingStore&lt;/code&gt; builder does several things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Points to our &lt;code&gt;movie_search.movies&lt;/code&gt; collection
&lt;/li&gt;
&lt;li&gt;Checks if an index already exists using our helper method &lt;code&gt;checkIndexExists()&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;If no index exists, automatically creates one with the name &lt;code&gt;vector_index&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Uses our &lt;code&gt;IndexMapping&lt;/code&gt; to define the index structure&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The resulting vector search index looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"fields"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"vector"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"embedding"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"numDimensions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"similarity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"cosine"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, we check if the collection already has data using &lt;code&gt;checkDataExists()&lt;/code&gt;, and if it's empty, we load and embed the movie dataset from our CSV file.&lt;/p&gt;

&lt;p&gt;Now, let's add those helper methods at the bottom of the class:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;loadDataFromCSV&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
            &lt;span class="nc"&gt;MongoDbEmbeddingStore&lt;/span&gt; &lt;span class="n"&gt;embeddingStore&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nc"&gt;VoyageAiEmbeddingModel&lt;/span&gt; &lt;span class="n"&gt;embeddingModel&lt;/span&gt;
    &lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;InterruptedException&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Loading data..."&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="nc"&gt;MovieEmbeddingService&lt;/span&gt; &lt;span class="n"&gt;embeddingService&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;MovieEmbeddingService&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embeddingStore&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embeddingModel&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;embeddingService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ingestMoviesFromCsv&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

        &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Movie data loaded successfully!"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Waiting 5 seconds for indexing to complete..."&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="nc"&gt;Thread&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sleep&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="nf"&gt;checkDataExists&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MongoClient&lt;/span&gt; &lt;span class="n"&gt;mongoClient&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;MongoCollection&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mongoClient&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getDatabase&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;databaseName&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getCollection&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collectionName&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;find&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;first&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="nf"&gt;checkIndexExists&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MongoClient&lt;/span&gt; &lt;span class="n"&gt;mongoClient&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;MongoCollection&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mongoClient&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getDatabase&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;databaseName&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getCollection&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collectionName&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MongoCursor&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;indexes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;listIndexes&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;iterator&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;indexes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;hasNext&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="nc"&gt;Document&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;indexes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;next&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;indexName&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;equals&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getString&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;indexName&lt;/span&gt;&lt;span class="o"&gt;)))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
                &lt;span class="o"&gt;}&lt;/span&gt;
            &lt;span class="o"&gt;}&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;checkIndexExists()&lt;/code&gt; method iterates through all indexes on the collection and returns &lt;code&gt;false&lt;/code&gt; if it finds one named &lt;code&gt;vector_index&lt;/code&gt;, telling LangChain4j to skip index creation. The &lt;code&gt;checkDataExists()&lt;/code&gt; method simply checks if the collection is empty.&lt;/p&gt;

&lt;p&gt;The five-second sleep after loading data gives MongoDB Atlas time to build the vector search index. Atlas indexing is eventually consistent, so this wait ensures our index is queryable before we start searching. In production, you'd want more robust index readiness checking, but for a tutorial, a brief sleep does the job.&lt;/p&gt;

&lt;h2&gt;
  
  
  Importing our data
&lt;/h2&gt;

&lt;p&gt;Now, we need to actually load the IMDB dataset, parse it, and convert the plot descriptions into vector embeddings. We'll create two classes: &lt;code&gt;Movie&lt;/code&gt; to represent each row in the CSV, and &lt;code&gt;MovieEmbeddingService&lt;/code&gt; to handle the embedding and storage logic.&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating the Movie model
&lt;/h3&gt;

&lt;p&gt;Create &lt;code&gt;Movie.java&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.movieagent&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.opencsv.bean.CsvBindByPosition&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Movie&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="nd"&gt;@CsvBindByPosition&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;position&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;posterLink&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="nd"&gt;@CsvBindByPosition&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;position&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="nd"&gt;@CsvBindByPosition&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;position&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;year&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="nd"&gt;@CsvBindByPosition&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;position&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;certificate&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="nd"&gt;@CsvBindByPosition&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;position&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;runtime&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="nd"&gt;@CsvBindByPosition&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;position&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;genre&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="nd"&gt;@CsvBindByPosition&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;position&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;imdbRating&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="nd"&gt;@CsvBindByPosition&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;position&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;overview&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="nd"&gt;@CsvBindByPosition&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;position&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;metaScore&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="nd"&gt;@CsvBindByPosition&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;position&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;director&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="nd"&gt;@CsvBindByPosition&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;position&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;star1&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="nd"&gt;@CsvBindByPosition&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;position&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;11&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;star2&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="nd"&gt;@CsvBindByPosition&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;position&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;star3&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="nd"&gt;@CsvBindByPosition&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;position&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;star4&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="nd"&gt;@CsvBindByPosition&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;position&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;14&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;numberOfVotes&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="nd"&gt;@CsvBindByPosition&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;position&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;gross&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;Movie&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getPosterLink&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;posterLink&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setPosterLink&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;posterLink&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;posterLink&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;posterLink&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getTitle&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setTitle&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getYear&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;year&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setYear&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;year&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;year&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;year&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getCertificate&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;certificate&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setCertificate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;certificate&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;certificate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;certificate&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getRuntime&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;runtime&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setRuntime&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;runtime&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;runtime&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;runtime&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getGenre&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;genre&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setGenre&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;genre&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;genre&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;genre&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getImdbRating&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;imdbRating&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setImdbRating&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;imdbRating&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;imdbRating&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;imdbRating&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getOverview&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;overview&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setOverview&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;overview&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;overview&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;overview&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getMetaScore&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;metaScore&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setMetaScore&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;metaScore&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;metaScore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;metaScore&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getDirector&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;director&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setDirector&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;director&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;director&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;director&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getStar1&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;star1&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setStar1&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;star1&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;star1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;star1&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getStar2&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;star2&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setStar2&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;star2&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;star2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;star2&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getStar3&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;star3&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setStar3&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;star3&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;star3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;star3&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getStar4&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;star4&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setStar4&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;star4&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;star4&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;star4&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getNumberOfVotes&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;numberOfVotes&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setNumberOfVotes&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;numberOfVotes&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;numberOfVotes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;numberOfVotes&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getGross&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;gross&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setGross&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;gross&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;gross&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gross&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="nd"&gt;@Override&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;format&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"%s (%s) - %s - Rating: %s\nGenre: %s | Director: %s\n%s"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
                &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;year&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;genre&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;imdbRating&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;genre&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;director&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;overview&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;@CsvBindByPosition&lt;/code&gt; annotations from OpenCSV map each CSV column to a field in our &lt;code&gt;Movie&lt;/code&gt; class. The most important field for our purposes is &lt;code&gt;overview&lt;/code&gt;, which contains the plot description we'll embed for semantic search.&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating embeddings for vector search
&lt;/h3&gt;

&lt;p&gt;Now, let's create the service that reads the CSV, generates embeddings, and stores them in MongoDB. Create &lt;code&gt;MovieEmbeddingService.java&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.movieagent&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.opencsv.bean.CsvToBeanBuilder&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.data.embedding.Embedding&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.data.document.Metadata&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.data.segment.TextSegment&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.model.embedding.EmbeddingModel&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.store.embedding.EmbeddingStore&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.io.InputStream&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.io.InputStreamReader&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MovieEmbeddingService&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;EmbeddingStore&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;TextSegment&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;embeddingStore&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;EmbeddingModel&lt;/span&gt; &lt;span class="n"&gt;embeddingModel&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;MovieEmbeddingService&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;EmbeddingStore&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;TextSegment&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;embeddingStore&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; 
                                  &lt;span class="nc"&gt;EmbeddingModel&lt;/span&gt; &lt;span class="n"&gt;embeddingModel&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;embeddingStore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embeddingStore&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;embeddingModel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embeddingModel&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;ingestMoviesFromCsv&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;InputStream&lt;/span&gt; &lt;span class="n"&gt;inputStream&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;getClass&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getClassLoader&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getResourceAsStream&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"imdb_top_1000.csv"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;inputStream&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;RuntimeException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"imdb_top_1000.csv not found"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="o"&gt;}&lt;/span&gt;

            &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Movie&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;movies&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;CsvToBeanBuilder&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Movie&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;InputStreamReader&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;inputStream&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
                    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;withType&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Movie&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;parse&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

            &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Processing "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;movies&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;size&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;" movies..."&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Movie&lt;/span&gt; &lt;span class="n"&gt;movie&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;movies&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;movie&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getTitle&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;movie&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getOverview&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                    &lt;span class="k"&gt;continue&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
                &lt;span class="o"&gt;}&lt;/span&gt;

                &lt;span class="nc"&gt;Metadata&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;getMetadata&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;movie&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
                &lt;span class="nc"&gt;TextSegment&lt;/span&gt; &lt;span class="n"&gt;segment&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TextSegment&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;from&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;movie&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getOverview&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
                &lt;span class="nc"&gt;Embedding&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embeddingModel&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;embed&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;segment&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

                &lt;span class="n"&gt;embeddingStore&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;add&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;segment&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
                &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Stored: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;movie&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getTitle&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
            &lt;span class="o"&gt;}&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Exception&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;err&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Error processing CSV: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getMessage&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
            &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;printStackTrace&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="nc"&gt;Metadata&lt;/span&gt; &lt;span class="nf"&gt;getMetadata&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Movie&lt;/span&gt; &lt;span class="n"&gt;movie&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;metadataMap&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;HashMap&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;();&lt;/span&gt;
        &lt;span class="n"&gt;metadataMap&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"title"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;movie&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getTitle&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
        &lt;span class="n"&gt;metadataMap&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"year"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;movie&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getYear&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
        &lt;span class="n"&gt;metadataMap&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"genre"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;movie&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getGenre&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
        &lt;span class="n"&gt;metadataMap&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"director"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;movie&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getDirector&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
        &lt;span class="n"&gt;metadataMap&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"imdbRating"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;movie&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getImdbRating&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
        &lt;span class="n"&gt;metadataMap&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"star1"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;movie&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getStar1&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
        &lt;span class="n"&gt;metadataMap&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"star2"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;movie&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getStar2&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Metadata&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metadataMap&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's walk through what's happening here:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;CSV parsing&lt;/strong&gt;: We use OpenCSV's &lt;code&gt;CsvToBeanBuilder&lt;/code&gt; to read &lt;code&gt;imdb_top_1000.csv&lt;/code&gt; from our resources directory and parse it into a list of &lt;code&gt;Movie&lt;/code&gt; objects.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Embedding generation&lt;/strong&gt;: For each movie, we:  &lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Extract the &lt;code&gt;overview&lt;/code&gt; (plot description).
&lt;/li&gt;
&lt;li&gt;Create a &lt;code&gt;TextSegment&lt;/code&gt; containing the overview text and metadata.
&lt;/li&gt;
&lt;li&gt;Pass the segment to &lt;code&gt;embeddingModel.embed()&lt;/code&gt;(our Voyage AI model) to generate a 1024-dimensional vector.
&lt;/li&gt;
&lt;li&gt;Store both the embedding and the segment (with its metadata) in MongoDB.&lt;/li&gt;
&lt;/ul&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Metadata storage&lt;/strong&gt;: We attach key movie information as metadata (title, year, genre, etc.). This gets stored alongside the embedding in MongoDB, so when we search, we get back not just vectors but all the movie details we need.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The beauty of this approach is that we're only embedding the &lt;code&gt;overview&lt;/code&gt; field. When a user searches for "rebels fighting an empire," the semantic search matches that against movie plot descriptions, not titles or actor names. Users can describe what they remember about a movie without needing to know its exact name.&lt;/p&gt;

&lt;p&gt;If you want to learn more about how to use MongoDB with LangChain4j, check out our &lt;a href="https://www.mongodb.com/docs/atlas/ai-integrations/langchain4j/?utm_campaign=devrel&amp;amp;utm_source=third-part-content&amp;amp;utm_medium=cta&amp;amp;utm_content=langchain4j+agent&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;documentation&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Creating our agent
&lt;/h2&gt;

&lt;p&gt;Now comes the fun part: building the agentic system. We'll create three specialized tools that our supervisor agent can use, then wire them together.&lt;/p&gt;

&lt;h3&gt;
  
  
  Understanding the architecture
&lt;/h3&gt;

&lt;p&gt;Our agent follows the &lt;a href="https://docs.langchain4j.dev/tutorials/agents/#supervisor-design-and-customization" rel="noopener noreferrer"&gt;supervisor pattern&lt;/a&gt;, where a planning model (powered by GPT-4o-mini) decides which tools to call and in what order. Here's the flow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;User query&lt;/strong&gt;: "Find me a sci-fi movie about rebels fighting an empire and tell me where to stream it in GB"
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Supervisor analyzes the query&lt;/strong&gt; and generates a plan
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool 1 (MongoDB search)&lt;/strong&gt;: Searches for movies matching the plot description
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool 2 (Watchmode search)&lt;/strong&gt;: Gets the Watchmode ID for that movie
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool 3 (Watchmode sources)&lt;/strong&gt;: Fetches streaming availability for that ID and region
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Supervisor synthesizes&lt;/strong&gt; all the results into a clean, human-readable response&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The supervisor doesn't hardcode this workflow—it figures out the steps based on the context we provide it. This is the key difference between a traditional workflow and an agentic system: The LLM is doing the orchestration.&lt;/p&gt;

&lt;p&gt;Note: We use a different model for embedding and for our planning model because different models have different strengths. We could easily use ChatGPT for both embedding and chat/planning, but Voyage AI is not supported as a chat model.&lt;/p&gt;

&lt;h3&gt;
  
  
  Defining tools
&lt;/h3&gt;

&lt;p&gt;In LangChain4j's agentic module, a &lt;strong&gt;tool&lt;/strong&gt; is just a Java method annotated with &lt;code&gt;@Agent&lt;/code&gt;. The annotation tells the supervisor what the tool does, and the framework handles the rest—automatically converting method parameters to tool inputs and outputs to shared state.&lt;/p&gt;

&lt;p&gt;Let's create our three tools.&lt;/p&gt;

&lt;h4&gt;
  
  
  Tool 1: MongoDB Movie Search
&lt;/h4&gt;

&lt;p&gt;Create &lt;code&gt;MongoDbMovieSearchTool.java&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.movieagent&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.agentic.Agent&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.data.embedding.Embedding&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.data.segment.TextSegment&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.model.embedding.EmbeddingModel&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.service.V&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.store.embedding.EmbeddingSearchRequest&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.store.embedding.EmbeddingSearchResult&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.store.embedding.mongodb.MongoDbEmbeddingStore&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MongoDbMovieSearchTool&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;MongoDbEmbeddingStore&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;EmbeddingModel&lt;/span&gt; &lt;span class="n"&gt;embeddingModel&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;MongoDbMovieSearchTool&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;EmbeddingModel&lt;/span&gt; &lt;span class="n"&gt;embeddingModel&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; 
                                   &lt;span class="nc"&gt;MongoDbEmbeddingStore&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;embeddingModel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embeddingModel&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="nd"&gt;@Agent&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Search movies in MongoDB by semantic description"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;outputName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"movieTitle"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nd"&gt;@V&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"query"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="nc"&gt;Embedding&lt;/span&gt; &lt;span class="n"&gt;queryEmbedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embeddingModel&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;embed&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

            &lt;span class="nc"&gt;EmbeddingSearchResult&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;TextSegment&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;search&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                    &lt;span class="nc"&gt;EmbeddingSearchRequest&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;queryEmbedding&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;queryEmbedding&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;maxResults&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
            &lt;span class="o"&gt;);&lt;/span&gt;

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(!&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;matches&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;isEmpty&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="nc"&gt;TextSegment&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;matches&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;getFirst&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;embedded&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
                &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toString&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;getString&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"title"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="o"&gt;}&lt;/span&gt;

            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"No matching movie found in MongoDB."&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Exception&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"Error searching MongoDB: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getMessage&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;@Agent&lt;/code&gt; annotation does two important things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;value&lt;/code&gt;&lt;/strong&gt;: Describes what this tool does. The supervisor uses this description to decide when to call it.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;outputName&lt;/code&gt;&lt;/strong&gt;: Specifies that the result should be stored in a shared variable called &lt;code&gt;"movieTitle"&lt;/code&gt;, which other tools can access.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The &lt;code&gt;@V("query")&lt;/code&gt; annotation on the parameter tells LangChain4j to look for a variable named &lt;code&gt;"query"&lt;/code&gt; in the shared state (called the &lt;code&gt;AgenticScope&lt;/code&gt;) and pass it to this method.&lt;/p&gt;

&lt;p&gt;When the supervisor calls this tool with a plot description like "rebels fighting an empire," we:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Convert the query to an embedding.
&lt;/li&gt;
&lt;li&gt;Perform a vector search in MongoDB to find the most similar movie plot.
&lt;/li&gt;
&lt;li&gt;Return the movie title, which gets stored as &lt;code&gt;"movieTitle"&lt;/code&gt; in the shared state.&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  Tool 2: Watchmode Search
&lt;/h4&gt;

&lt;p&gt;Create &lt;code&gt;WatchmodeSearchTool.java&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.movieagent&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.fasterxml.jackson.databind.JsonNode&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.fasterxml.jackson.databind.ObjectMapper&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.agentic.Agent&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.service.V&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.net.URI&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.net.URLEncoder&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.net.http.HttpClient&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.net.http.HttpRequest&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.net.http.HttpResponse&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.nio.charset.StandardCharsets&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;WatchmodeSearchTool&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;HttpClient&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;HttpClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;newHttpClient&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;apiKey&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;WatchmodeSearchTool&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;apiKey&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;apiKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;apiKey&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="nd"&gt;@Agent&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Find Watchmode ID for a given movie title"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; 
           &lt;span class="n"&gt;outputName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"watchmodeId"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getWatchmodeId&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nd"&gt;@V&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"title"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;format&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                    &lt;span class="s"&gt;"https://api.watchmode.com/v1/search/?apiKey=%s&amp;amp;search_field=name&amp;amp;search_value=%s&amp;amp;types=movie"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;apiKey&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;URLEncoder&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;encode&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;StandardCharsets&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;UTF_8&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;);&lt;/span&gt;

            &lt;span class="nc"&gt;HttpRequest&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;HttpRequest&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;newBuilder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;uri&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;URI&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;create&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;GET&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
            &lt;span class="nc"&gt;HttpResponse&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;send&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;HttpResponse&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;BodyHandlers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofString&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;

            &lt;span class="nc"&gt;JsonNode&lt;/span&gt; &lt;span class="n"&gt;root&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ObjectMapper&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;readTree&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
            &lt;span class="nc"&gt;JsonNode&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"title_results"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isEmpty&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"id"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;asInt&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
                &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Watchmode ID: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;valueOf&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="o"&gt;}&lt;/span&gt;

            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"No Watchmode ID found for "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Exception&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"Error retrieving ID: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getMessage&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This tool takes a movie title (like "Star Wars") and queries the Watchmode API to get its internal ID. That ID is then stored as &lt;code&gt;"watchmodeId"&lt;/code&gt; in the shared state, ready for the next tool to use.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://api.watchmode.com/" rel="noopener noreferrer"&gt;Watchmode's API documentation&lt;/a&gt; has more details on what is available with the API, but the key thing to know is that their &lt;code&gt;/search/&lt;/code&gt; endpoint returns a list of potential matches, and we're grabbing the first one.&lt;/p&gt;

&lt;h4&gt;
  
  
  Tool 3: Watchmode Sources
&lt;/h4&gt;

&lt;p&gt;Create &lt;code&gt;WatchmodeSourcesTool.java&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.movieagent&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.agentic.Agent&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.service.V&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.net.URI&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.net.http.HttpClient&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.net.http.HttpRequest&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.net.http.HttpResponse&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;WatchmodeSourcesTool&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;HttpClient&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;HttpClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;newHttpClient&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;apiKey&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;WatchmodeSourcesTool&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;apiKey&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;apiKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;apiKey&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="nd"&gt;@Agent&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Get streaming availability for a Watchmode ID"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;outputName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"streamingSources"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getSources&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nd"&gt;@V&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"watchmodeId"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; 
                             &lt;span class="nd"&gt;@V&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"region"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;format&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                    &lt;span class="s"&gt;"https://api.watchmode.com/v1/title/%s/sources/?apiKey=%s&amp;amp;regions=%s"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;apiKey&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt;
            &lt;span class="o"&gt;);&lt;/span&gt;

            &lt;span class="nc"&gt;HttpRequest&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;HttpRequest&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;newBuilder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;uri&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;URI&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;create&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;GET&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
            &lt;span class="nc"&gt;HttpResponse&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;send&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;HttpResponse&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;BodyHandlers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofString&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;

            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Exception&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"Error retrieving sources: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getMessage&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This tool takes the &lt;code&gt;watchmodeId&lt;/code&gt; from the previous step and a region code (like "GB" for Great Britain) and fetches the streaming availability. It returns the raw JSON response, which the supervisor will interpret to give the user a clean answer.&lt;/p&gt;

&lt;p&gt;Notice how this tool expects two inputs: &lt;code&gt;watchmodeId&lt;/code&gt; (from the previous tool) and &lt;code&gt;region&lt;/code&gt; (from the original user query). The supervisor figures out which variables to pass to each tool based on their parameter names.&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating the Supervisor
&lt;/h3&gt;

&lt;p&gt;Now, let's wire everything together. First, create a simple interface for the supervisor:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;MovieSupervisor.java&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.movieagent&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.agentic.Agent&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.service.V&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;interface&lt;/span&gt; &lt;span class="nc"&gt;MovieSupervisor&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nd"&gt;@Agent&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nd"&gt;@V&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"request"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This interface defines the entry point for our agent. Users will call &lt;code&gt;invoke()&lt;/code&gt; with a natural language query, and the supervisor will orchestrate the tools to produce an answer.&lt;/p&gt;

&lt;p&gt;Back in our &lt;code&gt;MovieAgentApp.java&lt;/code&gt;, let's instantiate the planning model and build the supervisor (add this after the embedding store setup):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;ChatModel&lt;/span&gt; &lt;span class="n"&gt;planningModel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAiChatModel&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;openAiKey&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;modelName&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"gpt-4o-mini"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

&lt;span class="nc"&gt;MongoDbMovieSearchTool&lt;/span&gt; &lt;span class="n"&gt;mongoSearch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
        &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;MongoDbMovieSearchTool&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embeddingModel&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embeddingStore&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="nc"&gt;WatchmodeSearchTool&lt;/span&gt; &lt;span class="n"&gt;watchmodeSearch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;WatchmodeSearchTool&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;watchmodeKey&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="nc"&gt;WatchmodeSourcesTool&lt;/span&gt; &lt;span class="n"&gt;watchmodeSources&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;WatchmodeSourcesTool&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;watchmodeKey&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="nc"&gt;MovieSupervisor&lt;/span&gt; &lt;span class="n"&gt;supervisor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AgenticServices&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;supervisorBuilder&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MovieSupervisor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;subAgents&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mongoSearch&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;watchmodeSearch&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;watchmodeSources&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;supervisorContext&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""
            You are a movie assistant.
            1. If the user gives a plot/description, call MongoDbMovieSearchTool to find the movie.
            2. Use WatchmodeSearchTool with the title to get a Watchmode ID.
            3. Use WatchmodeSourcesTool with that ID and the user's region to get streaming info.
            4. Return a clean human-readable response.
            """&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;chatModel&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;planningModel&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;responseStrategy&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;SupervisorResponseStrategy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;SUMMARY&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's break down what's happening in this supervisor configuration:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The planning model&lt;/strong&gt;: We're using &lt;code&gt;gpt-4o-mini&lt;/code&gt; as our planning model. This is the "brain" that decides which tools to call and when. It's cheaper and faster than GPT-4, but still capable enough to handle multi-step reasoning.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sub-agents&lt;/strong&gt;: We register our three tools with the supervisor. Each tool's &lt;code&gt;@Agent&lt;/code&gt; annotation provides a description that the planning model uses to decide when to invoke it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Supervisor context&lt;/strong&gt;: This is where things get interesting. The context is essentially a set of instructions for the planning model, telling it how to orchestrate the tools. It's similar to a system prompt in a regular chat application, but here it's guiding &lt;em&gt;planning&lt;/em&gt; rather than just responding.&lt;/p&gt;

&lt;p&gt;Think of the supervisor context as a recipe: "If the user describes a plot, search MongoDB first. Then use that title to get a Watchmode ID. Finally, fetch streaming sources." The LLM follows these steps dynamically, adapting based on what it finds. We can see how this would allow the planner to orchestrate things if the user didn't want to get where everything streams (only do steps 1, 2, and 4) or already had the movie title (skip step 1).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Response strategy&lt;/strong&gt;: We're using &lt;code&gt;SupervisorResponseStrategy.SUMMARY&lt;/code&gt;, which tells the supervisor to return a synthesized summary of all the tool calls rather than just the last tool's raw output. This means instead of getting back raw JSON from Watchmode, we get a clean answer like "Star Wars is available on Disney+ in GB."&lt;/p&gt;

&lt;p&gt;The alternative strategies are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;LAST&lt;/code&gt;: Returns the output of the last tool called (useful when the final tool produces the answer).
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;SCORED&lt;/code&gt;: Uses another LLM call to score both the summary and the last tool output, returning whichever is better.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For this use case, &lt;code&gt;SUMMARY&lt;/code&gt; makes sense because we want the agent to interpret the streaming data and present it nicely to the user.&lt;/p&gt;

&lt;h2&gt;
  
  
  Testing our movie recommendations
&lt;/h2&gt;

&lt;p&gt;Now, let's put it all together and test our agent. Add this to the end of your &lt;code&gt;main()&lt;/code&gt; method in &lt;code&gt;MovieAgentApp.java&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Find me a sci-fi movie about rebels fighting an empire in space and tell me where to stream it in GB."&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;supervisor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;invoke&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Agent Response:\n"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run your application with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;MONGODB_URI&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$MONGODB_URI&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nv"&gt;VOYAGE_AI_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$VOYAGE_AI_KEY&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nv"&gt;WATCHMODE_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$WATCHMODE_KEY&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nv"&gt;OPENAI_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$OPENAI_API_KEY&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
mvn compile &lt;span class="nb"&gt;exec&lt;/span&gt;:java &lt;span class="nt"&gt;-Dexec&lt;/span&gt;.mainClass&lt;span class="o"&gt;=&lt;/span&gt;com.mongodb.movieagent.MovieAgentApp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On first run, you'll see output like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Loading data...
Processing 1000 movies...
Stored: The Shawshank Redemption
Stored: The Godfather
...
Movie data loaded successfully!
Waiting 5 seconds for indexing to complete...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then the agent will process your query. You'll see trace output showing which tools are being called:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Watchmode ID: 12345
Agent Response:
Star Wars: Episode IV - A New Hope is available to stream on Disney+ in Great Britain. 
You can also rent it on Amazon Prime Video for £3.49.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What just happened?&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The supervisor analyzed your query and identified that you were describing a plot.
&lt;/li&gt;
&lt;li&gt;It called &lt;code&gt;MongoDbMovieSearchTool&lt;/code&gt;, embedding "rebels fighting an empire in space" and searching MongoDB.
&lt;/li&gt;
&lt;li&gt;MongoDB returned "Star Wars: Episode IV - A New Hope" as the best match.
&lt;/li&gt;
&lt;li&gt;The supervisor then called &lt;code&gt;WatchmodeSearchTool&lt;/code&gt; with that title to get the Watchmode ID.
&lt;/li&gt;
&lt;li&gt;Finally, it called &lt;code&gt;WatchmodeSourcesTool&lt;/code&gt; with the ID and region "GB."
&lt;/li&gt;
&lt;li&gt;The planning model synthesized all three tool outputs into a clean, conversational response.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Try other queries to see how the agent handles different scenarios:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Search by genre and mood&lt;/span&gt;
&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"I want a dark crime thriller about an undercover cop, where can I watch it in the US?"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// Search by vague description&lt;/span&gt;
&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"That movie with the kid who sees dead people, streaming options in US?"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// Mix of specifics&lt;/span&gt;
&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Christopher Nolan movie about dreams within dreams, available in GB?"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The beauty of this approach is that you never hardcoded the workflow. The supervisor figures out the steps based on your context instructions. If you wanted to add a fourth tool (say, fetching movie reviews), you'd just add it to the &lt;code&gt;subAgents()&lt;/code&gt; list and update the supervisor context—no need to rewrite orchestration logic.&lt;/p&gt;

&lt;h3&gt;
  
  
  Understanding the AgenticScope
&lt;/h3&gt;

&lt;p&gt;Behind the scenes, LangChain4j manages something called the &lt;a href="https://docs.langchain4j.dev/tutorials/agents/#introducing-the-agenticscope" rel="noopener noreferrer"&gt;&lt;code&gt;AgenticScope&lt;/code&gt;&lt;/a&gt;, which is essentially a shared key-value store for the conversation. When &lt;code&gt;MongoDbMovieSearchTool&lt;/code&gt; returns a title, it's stored as &lt;code&gt;"movieTitle"&lt;/code&gt; in this scope. When &lt;code&gt;WatchmodeSearchTool&lt;/code&gt; needs a title, it reads from &lt;code&gt;"movieTitle"&lt;/code&gt;. The supervisor handles all this variable passing automatically based on the &lt;code&gt;@V&lt;/code&gt; annotations you defined on each tool's parameters.&lt;/p&gt;

&lt;p&gt;If you wanted to inspect or debug what's in the scope at any point, you could implement the &lt;code&gt;AgenticScopeAccess&lt;/code&gt; interface on your supervisor, but for this tutorial, the automatic variable passing is all you need.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;We've built a genuine AI agent that can understand natural language movie queries, search semantically through plot descriptions, and fetch real-time streaming availability—all without hardcoding a single workflow step.&lt;/p&gt;

&lt;p&gt;By combining MongoDB Atlas for vector search with LangChain4j's agentic framework, we created a system that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Understands vague descriptions like "rebels fighting an empire" and maps them to actual movies.
&lt;/li&gt;
&lt;li&gt;Autonomously decides which tools to call and in what order.
&lt;/li&gt;
&lt;li&gt;Handles multi-step reasoning (search → get ID → fetch sources).
&lt;/li&gt;
&lt;li&gt;Returns clean, conversational responses instead of raw API data.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The supervisor pattern we used here is just one of many agentic architectures. You could extend this with additional tools (fetching reviews, checking rental prices, filtering by age rating), add memory so the agent remembers past conversations, or even create tool-specific agents that the supervisor coordinates.&lt;/p&gt;

&lt;p&gt;If you're curious about building more complex agentic systems, check out the &lt;a href="https://docs.langchain4j.dev/tutorials/agents" rel="noopener noreferrer"&gt;LangChain4j agentic documentation&lt;/a&gt; and explore patterns like loops, conditionals, and human-in-the-loop workflows.&lt;/p&gt;

&lt;p&gt;For more MongoDB and AI tutorials, see my other articles on &lt;a href="https://dev.to/mongodb/how-to-build-rag-applications-with-spring-ai-and-mongodb-5gaj"&gt;Building RAG Applications with Spring AI and MongoDB&lt;/a&gt; or dive into the &lt;a href="https://www.mongodb.com/docs/atlas/atlas-vector-search/?utm_campaign=devrel&amp;amp;utm_source=third-part-content&amp;amp;utm_medium=cta&amp;amp;utm_content=langchain4j+agent&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;MongoDB Atlas Vector Search documentation&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>mongodb</category>
      <category>ai</category>
      <category>agents</category>
      <category>java</category>
    </item>
    <item>
      <title>Secure Your Spring API With JWT and MongoDB</title>
      <dc:creator>Tim Kelly</dc:creator>
      <pubDate>Thu, 18 Sep 2025 16:29:05 +0000</pubDate>
      <link>https://forem.com/mongodb/secure-your-spring-api-with-jwt-and-mongodb-1gjj</link>
      <guid>https://forem.com/mongodb/secure-your-spring-api-with-jwt-and-mongodb-1gjj</guid>
      <description>&lt;p&gt;APIs don’t need to remember who you are if you bring your proof every time. That’s the idea behind &lt;a href="https://www.jwt.io/introduction#what-is-json-web-token" rel="noopener noreferrer"&gt;JSON Web Token (JWT)&lt;/a&gt;—a compact, signed token that carries just enough claims about the user to authorize a request. The server signs it, the client sends it in the &lt;code&gt;Authorization&lt;/code&gt; header, and any service with the public key can verify it. It fits SPAs, mobile apps, and microservices, scales cleanly, and makes rotation and revocation straightforward once you add refresh tokens or an auth server.&lt;/p&gt;

&lt;p&gt;We’re going to build a small, secure API with &lt;a href="https://spring.io/projects/spring-security" rel="noopener noreferrer"&gt;Spring Security&lt;/a&gt; and store user data in MongoDB. Spring Security already knows how to handle JWTs via the OAuth2 Resource Server support, so we’ll lean on that instead of writing custom filters.&lt;/p&gt;

&lt;p&gt;The flow is simple: Register a user, log in to get a signed JWT, and use that token to call protected endpoints. This is a starter, enough to get a real REST API locked down with stateless auth, and you can layer on refresh tokens, key rotation, and an external authorization server later. The code is all available on &lt;a href="https://github.com/mongodb-developer/spring-jwt" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Note: While we are working on security at the application level, you can learn more about security at the database level at &lt;a href="https://learn.mongodb.com/courses/mongodb-atlas-security?utm_campaign=devrel&amp;amp;utm_source=third-part-content&amp;amp;utm_medium=cta&amp;amp;utm_content=spring+jwt&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;MongoDB University&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;a href="https://www.mongodb.com/docs/guides/?utm_campaign=devrel&amp;amp;utm_source=third-part-content&amp;amp;utm_medium=cta&amp;amp;utm_content=spring+jwt&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;MongoDB Atlas account with a cluster set up&lt;/a&gt;

&lt;ul&gt;
&lt;li&gt;A free M0 cluster is perfect for this
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;a href="https://www.java.com/en/download/manual.jsp" rel="noopener noreferrer"&gt;Java&lt;/a&gt; 17+
&lt;/li&gt;

&lt;li&gt;

&lt;a href="https://maven.apache.org/download.cgi" rel="noopener noreferrer"&gt;Maven&lt;/a&gt; (I use version 3.9.10)&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Dependencies (Spring Initializr)
&lt;/h2&gt;

&lt;p&gt;Our first step is creating the application. The easiest way is with &lt;a href="https://start.spring.io/" rel="noopener noreferrer"&gt;Spring Initializr&lt;/a&gt;: Pick Maven, Java 21, Jar.&lt;/p&gt;

&lt;p&gt;Add these starters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Spring Web
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://spring.io/projects/spring-data-mongodb" rel="noopener noreferrer"&gt;Spring Data MongoDB&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;OAuth2 Resource Server (this brings in Spring Security and the Nimbus JWT pieces)
&lt;/li&gt;
&lt;li&gt;Validation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6uskhbpy2ef0d0f8qe80.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6uskhbpy2ef0d0f8qe80.png" alt="Spring Initializr config" width="800" height="427"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Our &lt;code&gt;pom.xml&lt;/code&gt; will include:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;dependencies&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.springframework.boot&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;spring-boot-starter-web&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;

  &lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.springframework.boot&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;spring-boot-starter-data-mongodb&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;

  &lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.springframework.boot&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;spring-boot-starter-oauth2-resource-server&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;

  &lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.springframework.boot&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;spring-boot-starter-validation&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/dependencies&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That’s all we need: Web for the controllers, MongoDB for persistence, the resource server starter for JWT verification and the security stack, and Validation for request DTOs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Configuring MongoDB to store user information
&lt;/h2&gt;

&lt;p&gt;Download the project from Initializr and open it in your IDE. Start by pointing Spring at MongoDB. In &lt;code&gt;application.properties&lt;/code&gt;, use your MongoDB Atlas connection string and name the database &lt;code&gt;users&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;spring.data.mongodb.uri=${MONGODB_URI}  
spring.data.mongodb.database=users

spring.data.mongodb.auto-index-creation=true
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We're also telling MongoDB that we can create indexes in the app if we use them and they do not yet exist. We will annotate these using the &lt;code&gt;@index&lt;/code&gt; notation later. This will be useful when we want to create a unique index for our usernames to prevent duplicates.&lt;/p&gt;

&lt;p&gt;We’re going to store user records in MongoDB. Create a &lt;code&gt;model&lt;/code&gt; package and add a &lt;code&gt;UserAccount&lt;/code&gt; document that represents what we’ll persist: a unique username, a BCrypt-hashed password, and a set of roles. We don't use the roles in this application, but this can be built upon to make some elements of the API only available to some user groups.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.springjwt.model&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.annotation.Id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.mongodb.core.index.Indexed&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.mongodb.core.mapping.Document&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Set&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"users"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;UserAccount&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="nd"&gt;@Id&lt;/span&gt; &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="nd"&gt;@Indexed&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;unique&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;        &lt;span class="c1"&gt;// BCrypt-hashed  &lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Set&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;roles&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;      &lt;span class="c1"&gt;// e.g. ["USER", "ADMIN"]  &lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getId&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setId&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getUsername&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setUsername&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;username&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getPassword&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setPassword&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;password&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Set&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getRoles&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;roles&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setRoles&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Set&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;roles&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;roles&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;roles&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;@Document("users")&lt;/code&gt; annotation tells Spring Data MongoDB to map this class to the &lt;code&gt;users&lt;/code&gt; collection and the &lt;code&gt;@Indexed(unique = true)&lt;/code&gt; enforces unique usernames at the database levels.&lt;/p&gt;

&lt;p&gt;Next up is a repository so the app can talk to the database. Create a &lt;code&gt;repository&lt;/code&gt; package and add &lt;code&gt;UserAccountRepository&lt;/code&gt;. Spring Data will generate the queries for us based on method names, so we can look users up by username and quickly check if one already exists.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.springjwt.repository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.springjwt.model.UserAccount&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.mongodb.repository.MongoRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Optional&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;interface&lt;/span&gt; &lt;span class="nc"&gt;UserAccountRepository&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;MongoRepository&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;UserAccount&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="nc"&gt;Optional&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;UserAccount&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;findByUsername&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="nf"&gt;existsByUsername&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, we need to teach Spring Security how to load users from MongoDB. We do that by implementing &lt;code&gt;UserDetailsService&lt;/code&gt; and adapting my &lt;code&gt;UserAccount&lt;/code&gt; document into Spring Security’s &lt;code&gt;User&lt;/code&gt; type.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.springjwt.service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.springjwt.repository.UserAccountRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.security.core.userdetails.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@Service&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MongoUserDetailsService&lt;/span&gt; &lt;span class="kd"&gt;implements&lt;/span&gt; &lt;span class="nc"&gt;UserDetailsService&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;UserAccountRepository&lt;/span&gt; &lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;MongoUserDetailsService&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;UserAccountRepository&lt;/span&gt; &lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;repo&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="nd"&gt;@Override&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;UserDetails&lt;/span&gt; &lt;span class="nf"&gt;loadUserByUsername&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;UsernameNotFoundException&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findByUsername&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;orElseThrow&lt;/span&gt;&lt;span class="o"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;UsernameNotFoundException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"User not found"&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;  

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;User&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;withUsername&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getUsername&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;password&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getPassword&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;roles&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getRoles&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;toArray&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]::&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On login, Spring calls &lt;code&gt;loadUserByUsername&lt;/code&gt;, we fetch the record, and if it exists, we hand back a &lt;code&gt;User&lt;/code&gt; with the stored BCrypt hash for the password and the roles assigned to that user. If not, we throw &lt;code&gt;UsernameNotFoundException&lt;/code&gt;. The &lt;code&gt;DaoAuthenticationProvider&lt;/code&gt; takes it from there and compares the raw password to the hash using the &lt;code&gt;PasswordEncoder&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;One thing to note: &lt;code&gt;.roles(...)&lt;/code&gt; expects bare role names like &lt;code&gt;USER&lt;/code&gt; and adds the &lt;code&gt;ROLE_&lt;/code&gt; prefix for us, so &lt;code&gt;Set.of("USER","ADMIN")&lt;/code&gt; in the database becomes authorities &lt;code&gt;ROLE_USER&lt;/code&gt; and &lt;code&gt;ROLE_ADMIN&lt;/code&gt; at runtime.&lt;/p&gt;

&lt;h2&gt;
  
  
  Our secure API
&lt;/h2&gt;

&lt;p&gt;We're adding a tiny controller with one endpoint to prove the security is actually doing something. It lives under &lt;code&gt;/api/secure&lt;/code&gt; and returns a greeting to whoever is authenticated. The &lt;code&gt;Principal&lt;/code&gt; comes from the &lt;code&gt;SecurityContext&lt;/code&gt; that Spring builds after a valid JWT is verified.&lt;/p&gt;

&lt;p&gt;Create a package called &lt;code&gt;controller&lt;/code&gt; and a new class &lt;code&gt;SecureController&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.springjwt.controller&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.web.bind.annotation.GetMapping&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.web.bind.annotation.RequestMapping&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.web.bind.annotation.RestController&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.security.Principal&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@RestController&lt;/span&gt;  
&lt;span class="nd"&gt;@RequestMapping&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/api/secure"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;SecureController&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="nd"&gt;@GetMapping&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/hello"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;hello&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Principal&lt;/span&gt; &lt;span class="n"&gt;principal&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"Hello, %s"&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;formatted&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;principal&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getName&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If we hit &lt;code&gt;/api/secure/hello&lt;/code&gt; without a token, we’ll get a &lt;code&gt;401 Unauthorized&lt;/code&gt; because the resource server is in play and everything is locked down by default. When we have everything in place, we'll be able to log in to get a JWT, send it in &lt;code&gt;Authorization: Bearer &amp;lt;token&amp;gt;&lt;/code&gt;, and the same call responds with &lt;code&gt;Hello, &amp;lt;username&amp;gt;&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Security configuration
&lt;/h2&gt;

&lt;p&gt;Here’s how we will wire security in this project. Let’s create a &lt;code&gt;config&lt;/code&gt; package with a single &lt;code&gt;SecurityConfig&lt;/code&gt; that defines the filter chain, the JWT components, and the authentication manager we'll use for the JSON login flow.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.springjwt.config&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.nimbusds.jose.jwk.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.nimbusds.jose.jwk.source.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.nimbusds.jose.proc.SecurityContext&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.context.annotation.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.security.authentication.AuthenticationManager&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.security.config.Customizer&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.security.config.annotation.authentication.configuration.AuthenticationConfiguration&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.security.config.annotation.web.builders.HttpSecurity&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.security.config.annotation.web.configurers.AbstractHttpConfigurer&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.security.config.http.SessionCreationPolicy&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.security.crypto.bcrypt.BCryptPasswordEncoder&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.security.crypto.password.PasswordEncoder&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.security.oauth2.jwt.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.security.web.SecurityFilterChain&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.security.interfaces.RSAPrivateKey&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.security.interfaces.RSAPublicKey&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@Configuration&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;SecurityConfig&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="nd"&gt;@Bean&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;SecurityFilterChain&lt;/span&gt; &lt;span class="nf"&gt;filterChain&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;HttpSecurity&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;Exception&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;csrf&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nl"&gt;AbstractHttpConfigurer:&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;disable&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;authorizeHttpRequests&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;auth&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;auth&lt;/span&gt;  
                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;requestMatchers&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/auth/token"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"/auth/register"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;permitAll&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  
                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;anyRequest&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;authenticated&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;oauth2ResourceServer&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;oauth2&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;oauth2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;jwt&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Customizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;withDefaults&lt;/span&gt;&lt;span class="o"&gt;()))&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sessionManagement&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sm&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;sm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sessionCreationPolicy&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;SessionCreationPolicy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;STATELESS&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt; 

        &lt;span class="nd"&gt;@Bean&lt;/span&gt;  
    &lt;span class="nc"&gt;AuthenticationManager&lt;/span&gt; &lt;span class="nf"&gt;authenticationManager&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;AuthenticationConfiguration&lt;/span&gt; &lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;Exception&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getAuthenticationManager&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We're disabling CSRF in this tutorial because we've made this service stateless and use Bearer tokens, not cookies. If we were using server sessions, disabling CSRF would be a mistake; with stateless JWTs, it’s the right call.&lt;/p&gt;

&lt;p&gt;The authorization rules are simple on purpose. Everything is locked down by default, with only &lt;code&gt;/auth/register&lt;/code&gt; and &lt;code&gt;/auth/token&lt;/code&gt; open so a client can create an account and exchange credentials for a token. The session policy is stateless to make every request self-contained. No server memory of prior requests, no hidden state.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;oauth2ResourceServer&lt;/code&gt; piece is what actually validates tokens on protected requests. That one line adds the bearer token filter and hands it a &lt;code&gt;JwtDecoder&lt;/code&gt;, which we will implement soon. On each call, it pulls the token from the header, checks the signature and expiry, and only then lets the controller run. We're using self-signed JWTs here: a private key to issue them at the login endpoint, the public key to verify them on every other request. It’s simple and elegant for a single service or a starter project.&lt;/p&gt;

&lt;p&gt;We also expose an &lt;code&gt;AuthenticationManager&lt;/code&gt; so the &lt;code&gt;/auth/token&lt;/code&gt; endpoint can accept a JSON body and call &lt;code&gt;authenticate(...)&lt;/code&gt; to verify the username and password against MongoDB before minting a JWT. I chose a JSON contract for login because it’s easier to evolve, adding captcha or device ID info or whatever we need.&lt;/p&gt;

&lt;h3&gt;
  
  
  Storing passwords securely
&lt;/h3&gt;

&lt;p&gt;We're are going to add another bean to hash our passwords:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;    &lt;span class="nd"&gt;@Bean&lt;/span&gt; &lt;span class="nc"&gt;PasswordEncoder&lt;/span&gt; &lt;span class="nf"&gt;passwordEncoder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;BCryptPasswordEncoder&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We need a &lt;code&gt;PasswordEncoder&lt;/code&gt; (using &lt;a href="https://docs.spring.io/spring-security/reference/api/java/org/springframework/security/crypto/bcrypt/BCryptPasswordEncoder.html" rel="noopener noreferrer"&gt;&lt;code&gt;BCryptPasswordEncoder&lt;/code&gt;&lt;/a&gt;) because we want our user passwords to live in MongoDB as hashes, not plaintext.&lt;/p&gt;

&lt;p&gt;If we grow our application to include refresh tokens or multiple services, we would want to split responsibilities. Put login, refresh, and key rotation in a dedicated &lt;a href="https://spring.io/projects/spring-authorization-server" rel="noopener noreferrer"&gt;authorization server&lt;/a&gt; and expose a &lt;a href="https://datatracker.ietf.org/doc/html/rfc7517" rel="noopener noreferrer"&gt;JSON Web Key (JWK)&lt;/a&gt; endpoint; point our resource servers at that and keep them focused on validating tokens and serving data. Having one issuer we could reuse across different applications would mean a smaller attack surface.&lt;/p&gt;

&lt;p&gt;We need a JWT decoder for our resource server, but before we do that, we need some RSA keys.&lt;/p&gt;

&lt;h2&gt;
  
  
  Creating RSA keys
&lt;/h2&gt;

&lt;p&gt;We can sign JWTs with a &lt;a href="https://auth0.com/blog/rs256-vs-hs256-whats-the-difference/" rel="noopener noreferrer"&gt;shared secret (HS256) or with a keypair (RS256)&lt;/a&gt;. We're using asymmetric keys so the app signs with a private key and verifies with the public key, which plays nicer with multiple services and future key rotation.&lt;/p&gt;

&lt;p&gt;Create &lt;code&gt;src/main/resources/certs&lt;/code&gt;, then generate keys with OpenSSL by running the following commands in that new folder location:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openssl genrsa &lt;span class="nt"&gt;-out&lt;/span&gt; keypair.pem 2048
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will generate with an encryption level of 2048, which is fine for this demo. Do bear in mind that this is not enough for long-term security, and &lt;a href="https://community.rsa.com/s/article/RSA-cryptography-and-NIST-guidance#:~:text=RSA%20is%20aware%20of%20the,stance%20based%20on%20this%20guidance." rel="noopener noreferrer"&gt;RSA 2028 is marked for depreciation by NIST by 2030&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This will generate the pair of keys that we can now extract the public key from, and place in a &lt;code&gt;public.pem&lt;/code&gt; file by running the following command in the same folder:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openssl rsa &lt;span class="nt"&gt;-in&lt;/span&gt; keypair.pem &lt;span class="nt"&gt;-pubout&lt;/span&gt; &lt;span class="nt"&gt;-out&lt;/span&gt; public.pem
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If we ran our code from here, we would get an error that the private key is in the wrong format because &lt;code&gt;openssl genrsa&lt;/code&gt; writes a legacy &lt;a href="https://www.ibm.com/docs/en/zos/3.1.0?topic=cryptography-pkcs-1-formats" rel="noopener noreferrer"&gt;&lt;code&gt;PKCS#1&lt;/code&gt;&lt;/a&gt; private key, but Java/Spring/Nimbus expect a &lt;a href="https://en.wikipedia.org/wiki/PKCS_8" rel="noopener noreferrer"&gt;&lt;code&gt;PKCS#8&lt;/code&gt;&lt;/a&gt; key.&lt;/p&gt;

&lt;p&gt;To extract the private key in the format we need, we run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openssl pkcs8 &lt;span class="nt"&gt;-topk8&lt;/span&gt; &lt;span class="nt"&gt;-inform&lt;/span&gt; PEM &lt;span class="nt"&gt;-outform&lt;/span&gt; PEM &lt;span class="nt"&gt;-nocrypt&lt;/span&gt; &lt;span class="nt"&gt;-in&lt;/span&gt; keypair.pem &lt;span class="nt"&gt;-out&lt;/span&gt; private.pem
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We should now have &lt;code&gt;public.pem&lt;/code&gt; and &lt;code&gt;private.pem&lt;/code&gt;, and we can safely delete the &lt;code&gt;keypair.pem&lt;/code&gt; file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;└── src/ 
   └── main/
     └── resources/certs/
         keypair.pem
         private.pem 
         public.pem  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We need a way to access these keys in our application. Let's configure our application so we can bind them into the app, keep the files local in dev, and move the locations to environment variables in prod.&lt;/p&gt;

&lt;p&gt;Create a class &lt;code&gt;RsaKeyProperties&lt;/code&gt; in &lt;code&gt;config&lt;/code&gt; package:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.springjwt.config&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.boot.context.properties.ConfigurationProperties&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.security.interfaces.RSAPrivateKey&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.security.interfaces.RSAPublicKey&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@ConfigurationProperties&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prefix&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"rsa"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="nf"&gt;RsaKeyProperties&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;RSAPublicKey&lt;/span&gt; &lt;span class="n"&gt;publicKey&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;RSAPrivateKey&lt;/span&gt; &lt;span class="n"&gt;privateKey&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In our main &lt;code&gt;SpringJwtApplication&lt;/code&gt; class, we need to enable this by adding &lt;code&gt;@EnableConfigurationProperties(RsaKeyProperties.class)&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.springjwt&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.springjwt.config.RsaKeyProperties&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.boot.SpringApplication&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.boot.autoconfigure.SpringBootApplication&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.boot.context.properties.EnableConfigurationProperties&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@SpringBootApplication&lt;/span&gt;  
&lt;span class="nd"&gt;@EnableConfigurationProperties&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;RsaKeyProperties&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;SpringJwtApplication&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="nc"&gt;SpringApplication&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;SpringJwtApplication&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And in application properties:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="n"&gt;rsa&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;private&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nl"&gt;classpath:&lt;/span&gt;&lt;span class="n"&gt;certs&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="kd"&gt;private&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;pem&lt;/span&gt;  
&lt;span class="n"&gt;rsa&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;public&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nl"&gt;classpath:&lt;/span&gt;&lt;span class="n"&gt;certs&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="kd"&gt;public&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;pem&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now that we can access our RSA keys in our app, we can create our JWT encoders and decoders.&lt;/p&gt;

&lt;h2&gt;
  
  
  Encoding and decoding JWT
&lt;/h2&gt;

&lt;p&gt;Back in &lt;code&gt;SecurityConfig&lt;/code&gt;, at the top of the class can, pull the keys in via the constructor:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;RSAPublicKey&lt;/span&gt; &lt;span class="n"&gt;publicKey&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;RSAPrivateKey&lt;/span&gt; &lt;span class="n"&gt;privateKey&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;SecurityConfig&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;RsaKeyProperties&lt;/span&gt; &lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;publicKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;publicKey&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;privateKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;privateKey&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, define the JWT encoder and decoder:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;    &lt;span class="nd"&gt;@Bean&lt;/span&gt;  
    &lt;span class="nc"&gt;JwtDecoder&lt;/span&gt; &lt;span class="nf"&gt;jwtDecoder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;NimbusJwtDecoder&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;withPublicKey&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;publicKey&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="nd"&gt;@Bean&lt;/span&gt;  
    &lt;span class="nc"&gt;JwtEncoder&lt;/span&gt; &lt;span class="nf"&gt;jwtEncoder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="no"&gt;JWK&lt;/span&gt; &lt;span class="n"&gt;jwk&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;RSAKey&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Builder&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;publicKey&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;privateKey&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;privateKey&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
        &lt;span class="nc"&gt;JWKSource&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;SecurityContext&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;jwks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ImmutableJWKSet&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;JWKSet&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;jwk&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;NimbusJwtEncoder&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;jwks&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We'll be using the &lt;code&gt;NimbusJwtDecoder&lt;/code&gt; and &lt;code&gt;NimbusJwtEncoder&lt;/code&gt;, that we have from our OAuth resource server.  &lt;code&gt;NimbusJwtDecoder&lt;/code&gt; verifies incoming tokens with the public key; &lt;code&gt;NimbusJwtEncoder&lt;/code&gt; signs new tokens with the private key. When a user logs in successfully, we’ll return a token produced by this encoder.&lt;/p&gt;

&lt;p&gt;Our next step is a small service that builds the claims and calls the encoder to mint that JWT.&lt;/p&gt;

&lt;h2&gt;
  
  
  Creating tokens
&lt;/h2&gt;

&lt;p&gt;I keep token creation in a small service so the controller doesn’t worry about JWT details. Create a package &lt;code&gt;service&lt;/code&gt; and create a &lt;code&gt;TokenService&lt;/code&gt; class:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.springjwt.service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.security.core.Authentication&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.security.core.GrantedAuthority&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.security.oauth2.jwt.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.time.Instant&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.time.temporal.ChronoUnit&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.stream.Collectors&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@Service&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TokenService&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;JwtEncoder&lt;/span&gt; &lt;span class="n"&gt;encoder&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;TokenService&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;JwtEncoder&lt;/span&gt; &lt;span class="n"&gt;encoder&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; 
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;encoder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;encoder&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; 
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Authentication&lt;/span&gt; &lt;span class="n"&gt;auth&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="nc"&gt;Instant&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Instant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;now&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;scope&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;auth&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getAuthorities&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;map&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nl"&gt;GrantedAuthority:&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;getAuthority&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;collect&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Collectors&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;joining&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;" "&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;  

        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;claims&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;JwtClaimsSet&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;issuer&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"self"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;issuedAt&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;expiresAt&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;plus&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;ChronoUnit&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;HOURS&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="c1"&gt;// short TTL is good  &lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;subject&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;auth&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getName&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;claim&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"scope"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scope&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;encoder&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;encode&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;JwtEncoderParameters&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;from&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;claims&lt;/span&gt;&lt;span class="o"&gt;)).&lt;/span&gt;&lt;span class="na"&gt;getTokenValue&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;Authentication&lt;/code&gt; that reaches this method represents a successfully logged-in user. We take their authorities and flatten them into a single space-delimited &lt;code&gt;scope&lt;/code&gt; string, then build a claim set that says who issued the token (self-assigned), when it was issued, when it expires (one hour), who it’s about (subject is our Principal.getName), and what scopes it carries.&lt;/p&gt;

&lt;p&gt;The encoder signs that claim set with the private key and hands us back a compact JWT string, which is what we return to the caller. A one-hour expiry keeps access tokens short-lived and easy to rotate later.&lt;/p&gt;

&lt;h2&gt;
  
  
  Authorization controller
&lt;/h2&gt;

&lt;p&gt;Time to add an auth API so clients can register and exchange credentials for a token. We keep this in a dedicated new class &lt;code&gt;AuthController&lt;/code&gt; in the &lt;code&gt;controller&lt;/code&gt; package under &lt;code&gt;/auth&lt;/code&gt;. It accepts JSON, checks credentials with the &lt;code&gt;AuthenticationManager&lt;/code&gt;, and mints a JWT with the &lt;code&gt;TokenService&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.springjwt.controller&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.springjwt.model.UserAccount&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.springjwt.repository.UserAccountRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.springjwt.service.TokenService&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;jakarta.validation.constraints.NotBlank&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.http.ResponseEntity&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.security.authentication.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.security.crypto.password.PasswordEncoder&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.web.bind.annotation.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Set&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@RestController&lt;/span&gt;  
&lt;span class="nd"&gt;@RequestMapping&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/auth"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AuthController&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;UserAccountRepository&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;PasswordEncoder&lt;/span&gt; &lt;span class="n"&gt;encoder&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;AuthenticationManager&lt;/span&gt; &lt;span class="n"&gt;authManager&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;TokenService&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;AuthController&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;UserAccountRepository&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;PasswordEncoder&lt;/span&gt; &lt;span class="n"&gt;encoder&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
                          &lt;span class="nc"&gt;AuthenticationManager&lt;/span&gt; &lt;span class="n"&gt;authManager&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;TokenService&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;users&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;encoder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;encoder&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;authManager&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;authManager&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="nf"&gt;RegisterRequest&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nd"&gt;@NotBlank&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nd"&gt;@NotBlank&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="nf"&gt;LoginRequest&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nd"&gt;@NotBlank&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nd"&gt;@NotBlank&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="nf"&gt;JwtResponse&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;  

    &lt;span class="nd"&gt;@PostMapping&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/register"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;ResponseEntity&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;?&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;register&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nd"&gt;@RequestBody&lt;/span&gt; &lt;span class="nd"&gt;@Valid&lt;/span&gt; &lt;span class="nc"&gt;RegisterRequest&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;existsByUsername&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;username&lt;/span&gt;&lt;span class="o"&gt;()))&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;ResponseEntity&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;badRequest&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Username taken"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;UserAccount&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
        &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setUsername&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;username&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;  
        &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setPassword&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;encoder&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;encode&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;password&lt;/span&gt;&lt;span class="o"&gt;()));&lt;/span&gt;  
        &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setRoles&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Set&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"USER"&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;  
        &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;save&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;ResponseEntity&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ok&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Registered"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="nd"&gt;@PostMapping&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/token"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;ResponseEntity&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;JwtResponse&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;token&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nd"&gt;@RequestBody&lt;/span&gt; &lt;span class="nd"&gt;@Valid&lt;/span&gt; &lt;span class="nc"&gt;LoginRequest&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;auth&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;authManager&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;authenticate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;UsernamePasswordAuthenticationToken&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;username&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;password&lt;/span&gt;&lt;span class="o"&gt;()));&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;ResponseEntity&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ok&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;JwtResponse&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;generate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;auth&lt;/span&gt;&lt;span class="o"&gt;)));&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The request records keep things tight and make sure username and password aren’t empty. Registration hashes the password with BCrypt and seeds a basic &lt;code&gt;USER&lt;/code&gt; role before saving to MongoDB.&lt;/p&gt;

&lt;p&gt;The token endpoint runs the credential check through Spring Security; on success, it returns a signed JWT. Hit a secured route without that token and you’ll get a 401; include &lt;code&gt;Authorization: Bearer &amp;lt;token&amp;gt;&lt;/code&gt; and the same call goes through.&lt;/p&gt;

&lt;p&gt;It would be worth setting up an &lt;code&gt;@RestControllerAdvice&lt;/code&gt; to shape 400/401/403 responses, which makes DX nicer than plain strings, but we'll leave that out of this tutorial for simplicity.&lt;/p&gt;

&lt;h2&gt;
  
  
  Testing with curl
&lt;/h2&gt;

&lt;p&gt;Start the app and use these calls from a terminal. I’m keeping them simple and in the exact order you’ll need.&lt;/p&gt;

&lt;p&gt;Register a user:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-i&lt;/span&gt; &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8080/auth/register &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"username":"tim","password":"secret"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And we can see it in our database:  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7n48ve5gc3dqphtcmi83.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7n48ve5gc3dqphtcmi83.png" alt="Image of document in the database" width="800" height="202"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Log in and grab a token:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8080/auth/token &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"username":"tim","password":"secret"}'&lt;/span&gt; | &lt;span class="nb"&gt;sed&lt;/span&gt; &lt;span class="nt"&gt;-E&lt;/span&gt; &lt;span class="s1"&gt;'s/.*"token":"([^"]+)".*/\1/'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$TOKEN&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Call the secured endpoint with the token:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-i&lt;/span&gt; http://localhost:8080/api/secure/hello &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer &lt;/span&gt;&lt;span class="nv"&gt;$TOKEN&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Try the same endpoint without a token to see the 401:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-i&lt;/span&gt; http://localhost:8080/api/secure/hello
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you want to sanity-check bad creds, this returns 401:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-i&lt;/span&gt; &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8080/auth/token &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"username":"tim","password":"wrong"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;We built a small but complete secure API: users in MongoDB, passwords hashed with BCrypt, a JSON login that mints short-lived RSA-signed JWTs, and protected routes enforced by Spring Security’s OAuth2 Resource Server. MongoDB fits well here because a user is just a simple document (username, hash, roles), unique indexes prevent duplicates, and you can easily extend the schema or add TTL-based refresh tokens later.&lt;/p&gt;

&lt;p&gt;Spring Security did the heavy lifting: &lt;code&gt;AuthenticationManager&lt;/code&gt; + your Mongo-backed &lt;code&gt;UserDetailsService&lt;/code&gt; verified credentials, &lt;code&gt;PasswordEncoder&lt;/code&gt; handled hashing, and the bearer token filter validated each JWT and built a &lt;code&gt;SecurityContext&lt;/code&gt; per request. It’s all stateless (no sessions to juggle) and ready for you to grow with issuer/audience checks, JWKS + rotation, and refresh tokens when you need them.&lt;/p&gt;

</description>
      <category>mongodb</category>
      <category>spring</category>
      <category>jwt</category>
      <category>security</category>
    </item>
    <item>
      <title>Quantize Your Vectors, Speed Up Your Java AI Applications</title>
      <dc:creator>Tim Kelly</dc:creator>
      <pubDate>Mon, 08 Sep 2025 14:38:13 +0000</pubDate>
      <link>https://forem.com/mongodb/quantize-your-vectors-speed-up-your-java-ai-applications-bng</link>
      <guid>https://forem.com/mongodb/quantize-your-vectors-speed-up-your-java-ai-applications-bng</guid>
      <description>&lt;p&gt;Vector quantization is the process of shrinking full fidelity vectors into fewer bits. It reduces the memory required to store each vector by storing the reduced representation of our vector instead. This reduces our resource consumption and allows for a more efficient application, at the cost of recall. MongoDB recommends quantization for applications with large numbers of vectors, i.e., over 100,000.&lt;/p&gt;

&lt;p&gt;With MongoDB Atlas Vector Search, we can automatically quantize our float vector embeddings. It also supports ingesting and indexing pre-quantized scalar and binary vectors from &lt;a href="https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-quantization/#prerequisites?utm_campaign=devrel&amp;amp;utm_source=third-part-content&amp;amp;utm_medium=cta&amp;amp;utm_content=vector+quantization&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;certain embedding models&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;First, we'll take a look into how vector quantization works, and then we'll implement it using the MongoDB Java sync driver. If you want to check out the code for this tutorial, it is all available in &lt;a href="https://github.com/mongodb-developer/vector-quantization" rel="noopener noreferrer"&gt;this GitHub repository&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Scalar quantization
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.mongodb.com/docs/manual/reference/glossary/#std-term-scalar-quantization?utm_campaign=devrel&amp;amp;utm_source=third-part-content&amp;amp;utm_medium=cta&amp;amp;utm_content=vector+quantization&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;Scalar quantization&lt;/a&gt; involves first identifying the minimum and maximum values for each dimension of the indexed vectors to establish a range of values for these dimensions. The range is then divided into equally sized bins. Each float is mapped into a bin to convert our previous float values into discrete integers. In MongoDB Atlas Vector Search, this scalar quantization reduces the vector embedding's RAM cost to almost 25% (&lt;code&gt;1/3.75&lt;/code&gt;) of the pre-quantization cost.&lt;/p&gt;

&lt;h2&gt;
  
  
  Binary quantization
&lt;/h2&gt;

&lt;p&gt;When an embedding is normalized to length &lt;code&gt;1&lt;/code&gt;, such as with OpenAI's &lt;code&gt;text-embedding-3-large&lt;/code&gt;, we can use binary quantization. Binary quantization involves assuming a midpoint of &lt;code&gt;0&lt;/code&gt; for each dimension in the embedding. For each value in the vector, we then assign a binary value of &lt;code&gt;1&lt;/code&gt; if the value is greater than the midpoint or &lt;code&gt;0&lt;/code&gt; if the value is less than or equal to the midpoint.&lt;/p&gt;

&lt;p&gt;In MongoDB Atlas Vector Search, this binary quantization reduces the vector embeddings RAM usage to just over 4% (&lt;code&gt;1/24&lt;/code&gt;) of the pre-quantization cost. The reason it's not &lt;code&gt;1/32&lt;/code&gt; is because the data structure containing the &lt;a href="https://arxiv.org/abs/1603.09320" rel="noopener noreferrer"&gt;Hierarchical Navigable Small Worlds&lt;/a&gt; graph itself, separate from the vector values, isn't compressed.&lt;/p&gt;

&lt;p&gt;When you run a query, Atlas Vector Search first converts the query’s float values into a binary vector, using the same midpoint as the index to enable efficient comparisons with stored binary vectors. After this initial match, it rescores the candidates by re-checking them against the original float values from the binary index to refine the ranking. The full fidelity vectors are kept in a separate on-disk data structure and are accessed only during rescoring when binary quantization is enabled, or when you perform exact search against binary or scalar quantized vectors.&lt;/p&gt;

&lt;h2&gt;
  
  
  Requirements
&lt;/h2&gt;

&lt;p&gt;In order to automatically quantize vectors, or ingest quantized vectors, there are some requirements that need to be met. The following table from the MongoDB vector quantization docs lays out these requirements:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Requirement&lt;/th&gt;
&lt;th&gt;For &lt;code&gt;int1&lt;/code&gt; Ingestion&lt;/th&gt;
&lt;th&gt;For &lt;code&gt;int8&lt;/code&gt; Ingestion&lt;/th&gt;
&lt;th&gt;For Automatic Scalar Quantization&lt;/th&gt;
&lt;th&gt;For Automatic Binary Quantization&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Requires index definition settings&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Requires BSON &lt;code&gt;binData&lt;/code&gt; format&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Storage on mongod&lt;/td&gt;
&lt;td&gt;&lt;code&gt;binData(int1)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;binData(int8)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;binData(float32)&lt;/code&gt; &lt;code&gt;array(double)\&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;binData(float32)&lt;/code&gt; &lt;code&gt;array(double)\&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Supported similarity method&lt;/td&gt;
&lt;td&gt;&lt;code&gt;euclidean&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;cosine&lt;/code&gt; &lt;code&gt;euclidean\&lt;/code&gt; &lt;code&gt;dotProduct\&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;cosine&lt;/code&gt; &lt;code&gt;euclidean\&lt;/code&gt; &lt;code&gt;dotProduct\&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;cosine&lt;/code&gt; &lt;code&gt;euclidean\&lt;/code&gt; &lt;code&gt;dotProduct\&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Supported number of dimensions&lt;/td&gt;
&lt;td&gt;Multiple of 8&lt;/td&gt;
&lt;td&gt;1 to 8192&lt;/td&gt;
&lt;td&gt;1 to 8192&lt;/td&gt;
&lt;td&gt;1 to 8192&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Supports ANN and ENN search&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Note: MongoDB Atlas stores all floating-point values as the &lt;code&gt;double&lt;/code&gt; data type internally. Therefore, both 32-bit and 64-bit embeddings are compatible with automatic quantization without conversion.&lt;/p&gt;

&lt;h2&gt;
  
  
  Our Java application
&lt;/h2&gt;

&lt;p&gt;For this tutorial, we'll walk through setting up our Java application and MongoDB database to embed some sample data, and run a vector search query on it. We'll take two main approaches: automatic quantization of our vectors, and ingesting pre-quantized vectors.&lt;/p&gt;

&lt;h3&gt;
  
  
  Prerequisites
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;a href="https://www.mongodb.com/docs/guides/?utm_campaign=devrel&amp;amp;utm_source=third-part-content&amp;amp;utm_medium=cta&amp;amp;utm_content=vector+quantization&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;MongoDB Atlas account with a cluster set up&lt;/a&gt;

&lt;ul&gt;
&lt;li&gt;A free M0 cluster is perfect for this
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Java on your machine, set up and ready to go (I use &lt;a href="https://jdk.java.net/21/" rel="noopener noreferrer"&gt;Java version 21&lt;/a&gt;)
&lt;/li&gt;

&lt;li&gt;

&lt;a href="https://maven.apache.org/download.cgi" rel="noopener noreferrer"&gt;Maven&lt;/a&gt; (I use version 3.9.10)
&lt;/li&gt;

&lt;li&gt;A &lt;a href="https://www.voyageai.com/" rel="noopener noreferrer"&gt;Voyage AI API&lt;/a&gt; token

&lt;ul&gt;
&lt;li&gt;Other embedding models work—just make sure it supports vector quantization. Check out a list of supported models in our &lt;a href="https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-quantization/#prerequisites?utm_campaign=devrel&amp;amp;utm_source=third-part-content&amp;amp;utm_medium=cta&amp;amp;utm_content=vector+quantization&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;documentation&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;Before we dive into the implementation, start by creating a new Maven application. Once the project is set up, open the &lt;code&gt;application.properties&lt;/code&gt; file and add the necessary configuration values for connection with MongoDB and Voyage AI, or add them as environment variables.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;VOYAGE_API_KEY=YOUR_VOYAGE_AI_API_KEY
MONGODB_URI=YOUR_MONGODB_CONNECTION_STRING
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We'll need to bring in our dependencies to our &lt;code&gt;pom.xml&lt;/code&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;dependencies&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;junit&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;junit&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;version&amp;gt;&lt;/span&gt;4.13.2&lt;span class="nt"&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;scope&amp;gt;&lt;/span&gt;test&lt;span class="nt"&gt;&amp;lt;/scope&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;com.fasterxml.jackson.core&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;jackson-databind&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;version&amp;gt;&lt;/span&gt;2.19.2&lt;span class="nt"&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.mongodb&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;mongodb-driver-sync&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;version&amp;gt;&lt;/span&gt;5.3.1&lt;span class="nt"&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.slf4j&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;slf4j-api&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;version&amp;gt;&lt;/span&gt;2.0.16&lt;span class="nt"&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.slf4j&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;slf4j-simple&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;version&amp;gt;&lt;/span&gt;2.0.16&lt;span class="nt"&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;scope&amp;gt;&lt;/span&gt;test&lt;span class="nt"&gt;&amp;lt;/scope&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.json&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;json&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;version&amp;gt;&lt;/span&gt;20250517&lt;span class="nt"&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;com.squareup.okhttp3&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;okhttp&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;version&amp;gt;&lt;/span&gt;4.12.0&lt;span class="nt"&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;/dependencies&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let’s quickly break down what each of these dependencies is doing in our project:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;JUnit: Provides a simple testing framework so we can write unit tests for our application. While not strictly required for this tutorial, it’s good practice to keep tests in place as you iterate.
&lt;/li&gt;
&lt;li&gt;Jackson Databind: Used to map JSON responses (like the embeddings we get back from Voyage AI) into Java objects. It also makes it easy to serialize Java objects into JSON when needed.
&lt;/li&gt;
&lt;li&gt;MongoDB Driver Sync: The official synchronous Java Driver for MongoDB. This is how our Java application will connect to Atlas, insert documents, and run vector search queries.
&lt;/li&gt;
&lt;li&gt;SLF4J API and SLF4J Simple: SLF4J is a logging facade that gives us a consistent API for logging. The “Simple” binding is a lightweight implementation that prints logs to the console. Together, these help us see what’s going on under the hood when we run our app.
&lt;/li&gt;
&lt;li&gt;JSON: A lightweight library for constructing and manipulating JSON. We’ll use this when parsing small JSON objects or building payloads, especially when interacting with the embedding API.
&lt;/li&gt;
&lt;li&gt;OkHttp: A modern, efficient HTTP client for Java. We’ll use this to make requests to the Voyage AI API to fetch embeddings for our sample data.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Setting up our main class
&lt;/h2&gt;

&lt;p&gt;For the sake of simplicity, we'll keep the logic for this application in the one &lt;code&gt;Main&lt;/code&gt; class. This &lt;code&gt;Main&lt;/code&gt; class sets up everything the app needs before we write any logic:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Environment-driven config: &lt;code&gt;VOYAGE_API_KEY&lt;/code&gt; and &lt;code&gt;MONGODB_URI&lt;/code&gt; are read from environment variables so you don’t hardcode secrets. If either is missing, the app fails fast with a clear error.
&lt;/li&gt;
&lt;li&gt;Project constants: Database/collection/index names (&lt;code&gt;test&lt;/code&gt;, &lt;code&gt;demo&lt;/code&gt;, &lt;code&gt;vector_index&lt;/code&gt;) are what we'll use to connect to our MongoDB database and configure our vector search index.
&lt;/li&gt;
&lt;li&gt;HTTP details: &lt;code&gt;VOYAGE_API_URL&lt;/code&gt; points at Voyage AI’s embeddings endpoint. The connection/read timeouts keep our HTTP calls predictable.
&lt;/li&gt;
&lt;li&gt;Toy dataset + query: &lt;code&gt;DATA&lt;/code&gt; is a small list of documents we’ll embed and store; &lt;code&gt;QUERY_TEXT&lt;/code&gt; is what we’ll use to run a vector search later. I was not too creative here, but you can absolutely plug in your own data, and even make the query text interactive if you want to play around with it.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Main&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="c1"&gt;// Configurations&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="no"&gt;VOYAGE_API_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getenv&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"VOYAGE_API_KEY"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="no"&gt;MONGODB_URI&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getenv&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"MONGODB_URI"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="no"&gt;DB_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"test"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="no"&gt;COLLECTION_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"demo"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="no"&gt;INDEX_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"vector_index"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;ObjectMapper&lt;/span&gt; &lt;span class="no"&gt;OBJECT_MAPPER&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ObjectMapper&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

    &lt;span class="c1"&gt;// Voyage AI API Endpoint&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="no"&gt;VOYAGE_API_URL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"https://api.voyageai.com/v1/embeddings"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;// Timeout values for API requests&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="no"&gt;CONNECTION_TIMEOUT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="no"&gt;READ_TIMEOUT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;// Sample Data&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="no"&gt;DATA&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
            &lt;span class="s"&gt;"The Great Wall of China is visible from space."&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"The Eiffel Tower was completed in Paris in 1889."&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"Mount Everest is the highest peak on Earth at 8,848m."&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"Shakespeare wrote 37 plays and 154 sonnets during his lifetime."&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"The Mona Lisa was painted by Leonardo da Vinci."&lt;/span&gt;
    &lt;span class="o"&gt;);&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="no"&gt;QUERY_TEXT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Country landmarks"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;VOYAGE_API_KEY&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="no"&gt;VOYAGE_API_KEY&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isEmpty&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;RuntimeException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"API key not found. Set VOYAGE_API_KEY in your environment."&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;MONGODB_URI&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="no"&gt;MONGODB_URI&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isEmpty&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;RuntimeException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"MongoDB URI not found. Set MONGODB_URI in your environment."&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;

        &lt;span class="c1"&gt;// we'll be adding more code here&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// and here&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We’ll add the HTTP call to Voyage, the MongoDB write, and the vector search inside &lt;code&gt;main&lt;/code&gt; where the comments indicate.&lt;/p&gt;

&lt;h3&gt;
  
  
  Response records
&lt;/h3&gt;

&lt;p&gt;We are going to create two records to handle our responses from the Voyage AI API. The first record is going to be called &lt;code&gt;ResponseDouble&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;When we use automatic quantization in MongoDB Atlas, we send and store float vectors (received as doubles in Java). This record mirrors the Voyage AI JSON shape for float embeddings:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;data[i].embedding&lt;/code&gt; is &lt;code&gt;List&amp;lt;Double&amp;gt;&lt;/code&gt;, which maps cleanly to MongoDB’s &lt;code&gt;array&amp;lt;double&amp;gt;&lt;/code&gt; (what Atlas expects when it will auto-quantize for us).
&lt;/li&gt;
&lt;li&gt;This path keeps the original full-fidelity vectors available for rescoring while the index stores the quantized form in RAM.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.timkelly&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.List&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="nf"&gt;ResponseDouble&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;  
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;object&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
    &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;DataItem&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
    &lt;span class="nc"&gt;Usage&lt;/span&gt; &lt;span class="n"&gt;usage&lt;/span&gt;  
&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="nf"&gt;DataItem&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
                &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;object&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
                &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Double&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// stored as doubles in MongoDB  &lt;/span&gt;
                &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt;  
        &lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;  
        &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="nf"&gt;Usage&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
                &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;total_tokens&lt;/span&gt;  
        &lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We'll use this model whenever we plan to enable &lt;code&gt;quantization.enabled: true&lt;/code&gt; with &lt;code&gt;"type": "scalar"&lt;/code&gt; or &lt;code&gt;"binary"&lt;/code&gt; in our MongoDB Atlas Search index definition. We'll go more into depth with this later in the tutorial.&lt;/p&gt;

&lt;p&gt;For pre-quantized ingestion, some embedding providers return reduced-precision vectors directly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;int8&lt;/code&gt; (scalar-quantized) → a &lt;code&gt;byte[]&lt;/code&gt; per vector
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;int1&lt;/code&gt; (binary-quantized/packed bits) → also a &lt;code&gt;byte[]&lt;/code&gt;, but bit-packed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We are going to set up a &lt;code&gt;ResponseBytes&lt;/code&gt; record for this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.timkelly&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.List&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="nf"&gt;ResponseBytes&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;object&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;DataItem&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
        &lt;span class="nc"&gt;Usage&lt;/span&gt; &lt;span class="n"&gt;usage&lt;/span&gt;  
&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="nf"&gt;DataItem&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
            &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;object&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
            &lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// int8 values or packed bits  &lt;/span&gt;
            &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt;  
    &lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="nf"&gt;Usage&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
            &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;total_tokens&lt;/span&gt;  
    &lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This record captures that response shape. When we insert into MongoDB, we’ll store the bytes as BSON &lt;code&gt;binData&lt;/code&gt; so Atlas can index them as quantized vectors without extra conversion. One of the reasons we are using Voyage AI here is because we can get these pre-quantized values returned. If you are using a different model, your responses will look different, but there are numerous other models out there that can pre-quantize our vectors.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to enable automatic quantization of vectors
&lt;/h2&gt;

&lt;p&gt;We can configure MongoDB Atlas Vector Search to automatically quantize float vector embeddings in our collection to reduced representation types, such as &lt;code&gt;int8&lt;/code&gt; (scalar) and &lt;code&gt;binary&lt;/code&gt; in our vector indexes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"fields"&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"vector"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;field-to-index&amp;gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"numDimensions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;&amp;lt;number-of-dimensions&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"similarity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"euclidean | cosine | dotProduct"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"quantization"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"none | scalar | binary"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"hnswOptions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"maxEdges"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;&amp;lt;number-of-connected-neighbors&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"numEdgeCandidates"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;&amp;lt;number-of-nearest-neighbors&amp;gt;&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"filter"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;field-to-index&amp;gt;"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To set or change the quantization type, we specify a &lt;code&gt;quantization&lt;/code&gt; field value of either &lt;code&gt;scalar&lt;/code&gt; or &lt;code&gt;binary&lt;/code&gt; in our index definition. This triggers an index rebuild similar to any other index definition change. The specified quantization type applies to all indexed vectors and query vectors at query-time. We don't need to change our query as our query vectors are automatically quantized. We'll create our index in the code using the MongoDB Java Driver, but first, we'll create a method to get our embeddings for our data.&lt;/p&gt;

&lt;h3&gt;
  
  
  Embedding our data
&lt;/h3&gt;

&lt;p&gt;Underneath our main method, we'll create a method &lt;code&gt;embedDataAndCreateDocument&lt;/code&gt; that will take in a list of strings (our data to embed) and return a list of BSON documents with all of our embeddings ready to be queried using our automatic scalar or binary quantized vectors.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;embedDataAndCreateDocument&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="nc"&gt;OkHttpClient&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OkHttpClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;connectTimeout&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;CONNECTION_TIMEOUT&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;TimeUnit&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;SECONDS&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;readTimeout&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;READ_TIMEOUT&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;TimeUnit&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;SECONDS&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="nc"&gt;ResponseDouble&lt;/span&gt; &lt;span class="n"&gt;baselineValues&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embedBaselineDoubles&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;   

        &lt;span class="c1"&gt;// Create a list of BSON documents containing all embeddings  &lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;createCombinedEmbeddingsDocument&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
                &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
                &lt;span class="n"&gt;baselineValues&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;IOException&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;RuntimeException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Error fetching embedding: "&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;First, we'll create an &lt;code&gt;OkHttpClient&lt;/code&gt; to connect to our Voyage AI API. We'll then query our API and return our response to our &lt;code&gt;ResponseDouble&lt;/code&gt; record. We'll define our &lt;code&gt;embedBaselineDoubles&lt;/code&gt; method to take in this client and the data to embed.&lt;/p&gt;

&lt;h3&gt;
  
  
  Getting our embeddings with Voyage AI
&lt;/h3&gt;

&lt;p&gt;Now, we are going to need a method to communicate with the Voyage AI API. Let's define a method &lt;code&gt;sendRequest()&lt;/code&gt; that will allow us to do this. We are going to reuse this method throughout this tutorial, so we will add configuration for the output type in the method parameters, along with the client and data to embed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Send API request to Voyage AI  &lt;/span&gt;
&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;sendRequest&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
        &lt;span class="nc"&gt;OkHttpClient&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;outputDataType&lt;/span&gt;  
&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;IOException&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;requestBody&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;JSONObject&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"input"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"model"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"voyage-3-large"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"input_type"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"document"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"output_dtype"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;outputDataType&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"output_dimension"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toString&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  

    &lt;span class="nc"&gt;Request&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Request&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;VOYAGE_API_URL&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;post&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;RequestBody&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;create&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;requestBody&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;MediaType&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"application/json"&lt;/span&gt;&lt;span class="o"&gt;)))&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;addHeader&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Authorization"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Bearer "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="no"&gt;VOYAGE_API_KEY&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  

    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Response&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;newCall&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;execute&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(!&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isSuccessful&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;IOException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"API error: HTTP "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;code&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;" - "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;  
        &lt;span class="o"&gt;}&lt;/span&gt;  

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;IOException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Empty response body"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="o"&gt;}&lt;/span&gt;  

        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;string&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
        &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Raw API JSON:\n"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We'll also be defining the embedding model to use, the input data type, and the output dimensions we need. It's important that the output dimension we use uses the same number as the dimensions we will specify in our vector search index.&lt;/p&gt;

&lt;p&gt;After this, we can define our method &lt;code&gt;embedBaselineDoubles&lt;/code&gt;. We will use this to call our &lt;code&gt;sendRequest&lt;/code&gt; and define our output data type to float. In MongoDB, we'll be storing this output as doubles, hence the naming conventions I use.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="nc"&gt;ResponseDouble&lt;/span&gt; &lt;span class="nf"&gt;embedBaselineDoubles&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;OkHttpClient&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;IOException&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;outputDataType&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"float"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;OBJECT_MAPPER&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;readValue&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sendRequest&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;outputDataType&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt; &lt;span class="nc"&gt;ResponseDouble&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This method takes a list of strings, sends them to our embedding service requesting the float-based embeddings, and then deserializes the JSON response into a &lt;code&gt;ResponseDouble&lt;/code&gt; object.&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating our BSON document
&lt;/h3&gt;

&lt;p&gt;Lastly, we can define our method &lt;code&gt;createCombinedEmbeddingsDocument&lt;/code&gt; that takes these embeddings and converts them to our document to store in MongoDB, that provides our response for the &lt;code&gt;embedDataAndCreateDocument&lt;/code&gt; method.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Create a list of BSON document containing all embedding responses  &lt;/span&gt;
&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;createCombinedEmbeddingsDocument&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;texts&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
        &lt;span class="nc"&gt;ResponseDouble&lt;/span&gt; &lt;span class="n"&gt;floatResponse&lt;/span&gt; 
&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;embeddingsList&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ArrayList&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;();&lt;/span&gt;  

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;texts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;size&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="nc"&gt;Document&lt;/span&gt; &lt;span class="n"&gt;embeddingDoc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;texts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"embeddings_float32"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;floatResponse&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"embeddings_auto_scalar"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;floatResponse&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"embeddings_auto_binary"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;floatResponse&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;  
        &lt;span class="n"&gt;embeddingsList&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;add&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embeddingDoc&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;embeddingsList&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This method takes our texts and embeddings and builds MongoDB documents that include three parallel embedding fields:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One for float32 storage
&lt;/li&gt;
&lt;li&gt;One for auto scalar quantization
&lt;/li&gt;
&lt;li&gt;One for auto binary quantization&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Even though they all store the same raw values right now, MongoDB will use different index definitions (float32, scalar quantized, binary quantized) depending on which field we target.&lt;/p&gt;

&lt;h3&gt;
  
  
  Storing in MongoDB
&lt;/h3&gt;

&lt;p&gt;After all this, we can define our &lt;code&gt;storeAllEmbeddings&lt;/code&gt; method. This will take in our &lt;code&gt;MongoClient&lt;/code&gt; we define in our main method, as well as the list of embeddings, and insert them into the database and collection we define as constants at the start of our class.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;storeAllEmbeddings&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MongoCollection&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
&lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;insertMany&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt; 
&lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Inserted embeddings documents into MongoDB"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can create our embeddings, create our documents, and store them in MongoDB, but how can we use vector search to query our data? Well, we need to define a vector search index.&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating our Vector Search index
&lt;/h3&gt;

&lt;p&gt;We'll create a method &lt;code&gt;setupVectorSearchIndex&lt;/code&gt; that we can define our index:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Create the Vector Search index  &lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setupVectorSearchIndex&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MongoClient&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;InterruptedException&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="nc"&gt;MongoDatabase&lt;/span&gt; &lt;span class="n"&gt;database&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getDatabase&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;DB_NAME&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="nc"&gt;MongoCollection&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;database&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getCollection&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;COLLECTION_NAME&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

    &lt;span class="nc"&gt;Bson&lt;/span&gt; &lt;span class="n"&gt;definition&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
            &lt;span class="s"&gt;"fields"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
            &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
                    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"type"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"vector"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"path"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"embeddings_float32"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"numDimensions"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"similarity"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"dotProduct"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;  
                    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"type"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"vector"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"path"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"embeddings_auto_scalar"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"numDimensions"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"similarity"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"dotProduct"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"quantization"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"scalar"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;  
                    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"type"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"vector"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"path"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"embeddings_auto_binary"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"numDimensions"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"similarity"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"dotProduct"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"quantization"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"binary"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="o"&gt;);&lt;/span&gt;  

    &lt;span class="nc"&gt;SearchIndexModel&lt;/span&gt; &lt;span class="n"&gt;indexModel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;SearchIndexModel&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
            &lt;span class="no"&gt;INDEX_NAME&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
            &lt;span class="n"&gt;definition&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
            &lt;span class="nc"&gt;SearchIndexType&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;vectorSearch&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  
    &lt;span class="o"&gt;);&lt;/span&gt;  

    &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;createSearchIndexes&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Collections&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;singletonList&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;indexModel&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;

    &lt;span class="n"&gt;waitForIndex&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;INDEX_NAME&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This method wires up the vector search index so we can actually query our embeddings. We grab the collection, then define an index with three vector fields:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;embeddings_float32&lt;/code&gt;: Raw float vectors (stored as doubles in BSON). No quantization.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;embeddings_auto_scalar&lt;/code&gt;: Same vectors, but the index applies automatic scalar quantization.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;embeddings_auto_binary&lt;/code&gt;: Same vectors, but with automatic binary quantization.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each one uses &lt;code&gt;numDimensions: 1024&lt;/code&gt; and &lt;code&gt;similarity: dotProduct&lt;/code&gt;. You’d adjust &lt;code&gt;numDimensions&lt;/code&gt; if your model outputs a different size.&lt;/p&gt;

&lt;p&gt;The reason for three fields is simple: It lets you compare trade-offs.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No quantization gives the highest recall but eats the most memory.
&lt;/li&gt;
&lt;li&gt;Scalar cuts memory roughly to a quarter while keeping recall high.
&lt;/li&gt;
&lt;li&gt;Binary is the leanest and fastest, but recall can dip depending on the data.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;MongoDB handles the quantization inside the index. In your documents, you still just store doubles.&lt;/p&gt;

&lt;p&gt;Then, we can add our &lt;code&gt;waitForIndex&lt;/code&gt; method that will poll the database until the build is complete before moving on:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Wait for the index build to complete  &lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="no"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;waitForIndex&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;MongoCollection&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="no"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;indexName&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;startTime&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;nanoTime&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
    &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;timeoutNanos&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TimeUnit&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;SECONDS&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toNanos&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;nanoTime&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;startTime&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;timeoutNanos&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="nc"&gt;Document&lt;/span&gt; &lt;span class="n"&gt;indexRecord&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StreamSupport&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;listSearchIndexes&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;spliterator&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;filter&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;indexName&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;equals&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getString&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"name"&lt;/span&gt;&lt;span class="o"&gt;)))&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findAny&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;orElse&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;indexRecord&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"FAILED"&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;equals&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;indexRecord&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getString&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"status"&lt;/span&gt;&lt;span class="o"&gt;)))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
                &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;RuntimeException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Search index has FAILED status."&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
            &lt;span class="o"&gt;}&lt;/span&gt;  
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;indexRecord&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getBoolean&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"queryable"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
                &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;indexName&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;" index is ready to query"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
                &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
            &lt;span class="o"&gt;}&lt;/span&gt;  
        &lt;span class="o"&gt;}&lt;/span&gt;  
        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="nc"&gt;Thread&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sleep&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// busy-wait, avoid in production  &lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;InterruptedException&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="nc"&gt;Thread&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;currentThread&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;interrupt&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
            &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;RuntimeException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Query with Vector Search
&lt;/h3&gt;

&lt;p&gt;To query the data, we will define a &lt;code&gt;runVectorSearchQueries&lt;/code&gt; method. We'll be using this method again later to query pre-quantized vectors, but for now, we will just prepare it to query our data that uses the automatically quantized data (as well as a baseline that uses no quantization):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Run MongoDB vector search queries for all fields using query embeddings  &lt;/span&gt;
&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;runVectorSearchQueries&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MongoCollection&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="nc"&gt;OkHttpClient&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OkHttpClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;connectTimeout&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;CONNECTION_TIMEOUT&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;TimeUnit&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;SECONDS&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;readTimeout&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;READ_TIMEOUT&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;TimeUnit&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;SECONDS&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  

    &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;queryInput&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;QUERY_TEXT&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; 
        &lt;span class="nc"&gt;ResponseDouble&lt;/span&gt; &lt;span class="n"&gt;qFloatResp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embedBaselineDoubles&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;queryInput&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Double&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;qFloat&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;qFloatResp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;getFirst&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

        &lt;span class="n"&gt;runVectorSearchDouble&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;qFloat&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

    &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;IOException&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;RuntimeException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Error embedding query vector: "&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;First, we will create an embedding for our query using the &lt;code&gt;embedBaselineDoubles&lt;/code&gt; method from earlier. We can extract our vectors from the response. Next, we'll call a &lt;code&gt;runVectorSearchDoubles&lt;/code&gt; method. Let's create this method:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;runVectorSearchDouble&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
        &lt;span class="nc"&gt;MongoCollection&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Double&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;floatQuery&lt;/span&gt;  
&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;embeddingType&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"embeddings_float32"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"embeddings_auto_scalar"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"embeddings_auto_binary"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Bson&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;pipeline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;asList&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
                &lt;span class="n"&gt;vectorSearch&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
                        &lt;span class="n"&gt;fieldPath&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embeddingType&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;  
                        &lt;span class="n"&gt;floatQuery&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
                        &lt;span class="no"&gt;INDEX_NAME&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
                        &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;approximateVectorSearchOptions&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                &lt;span class="o"&gt;),&lt;/span&gt;  
                &lt;span class="n"&gt;project&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fields&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
                        &lt;span class="n"&gt;exclude&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"_id"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;  
                        &lt;span class="n"&gt;include&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;  
                        &lt;span class="n"&gt;metaVectorSearchScore&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"vectorSearchScore"&lt;/span&gt;&lt;span class="o"&gt;))));&lt;/span&gt;  

        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;aggregate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pipeline&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;into&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ArrayList&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;());&lt;/span&gt;  

        &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Outputting results: "&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toJson&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;  
        &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We loop through each embedding type, &lt;code&gt;embeddings_float32&lt;/code&gt;, &lt;code&gt;embeddings_auto_scalar&lt;/code&gt;, and &lt;code&gt;embeddings_auto_binary&lt;/code&gt;. For each one, we build a pipeline:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;vectorSearch&lt;/code&gt; stage

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;fieldPath(embeddingType)&lt;/code&gt; tells MongoDB which embedding field to search.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;floatQuery&lt;/code&gt; is our query vector (from the earlier &lt;code&gt;embedBaselineDoubles&lt;/code&gt; method).
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;INDEX_NAME&lt;/code&gt; points to the vector search index we created.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;2&lt;/code&gt; means return the top two matches.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;approximateVectorSearchOptions(5)&lt;/code&gt; says “use ANN with a candidate pool of 5.” That balances speed versus accuracy.
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;project&lt;/code&gt; stage

&lt;ul&gt;
&lt;li&gt;Excludes &lt;code&gt;_id&lt;/code&gt; (so results are cleaner).
&lt;/li&gt;
&lt;li&gt;Includes just &lt;code&gt;text&lt;/code&gt; and the computed &lt;code&gt;vectorSearchScore&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We then run the aggregation with &lt;code&gt;collection.aggregate(pipeline)&lt;/code&gt; and dump the results into a list. Finally, the results are printed, showing each document’s text and score.&lt;/p&gt;

&lt;p&gt;Now, let's update our main method to include all the necessary calls to our methods:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Main&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="c1"&gt;// Configurations&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="no"&gt;VOYAGE_API_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getenv&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"VOYAGE_API_KEY"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="no"&gt;MONGODB_URI&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getenv&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"MONGODB_URI"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="no"&gt;DB_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"test"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="no"&gt;COLLECTION_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"demo"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="no"&gt;INDEX_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"vector_index"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;ObjectMapper&lt;/span&gt; &lt;span class="no"&gt;OBJECT_MAPPER&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ObjectMapper&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

    &lt;span class="c1"&gt;// Voyage AI API Endpoint&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="no"&gt;VOYAGE_API_URL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"https://api.voyageai.com/v1/embeddings"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;// Timeout values for API requests&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="no"&gt;CONNECTION_TIMEOUT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="no"&gt;READ_TIMEOUT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;// Sample Data&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="no"&gt;DATA&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
            &lt;span class="s"&gt;"The Great Wall of China is visible from space."&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"The Eiffel Tower was completed in Paris in 1889."&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"Mount Everest is the highest peak on Earth at 8,848m."&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"Shakespeare wrote 37 plays and 154 sonnets during his lifetime."&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"The Mona Lisa was painted by Leonardo da Vinci."&lt;/span&gt;
    &lt;span class="o"&gt;);&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="no"&gt;QUERY_TEXT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Country landmarks"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;VOYAGE_API_KEY&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="no"&gt;VOYAGE_API_KEY&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isEmpty&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;RuntimeException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"API key not found. Set VOYAGE_API_KEY in your environment."&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;MONGODB_URI&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="no"&gt;MONGODB_URI&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isEmpty&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;RuntimeException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"MongoDB URI not found. Set MONGODB_URI in your environment."&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;

        &lt;span class="c1"&gt;// create embeddings and collect all responses&lt;/span&gt;
        &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Embedding data..."&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;allEmbeddingsDocuments&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embedDataAndCreateDocument&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;DATA&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MongoClient&lt;/span&gt; &lt;span class="n"&gt;mongoClient&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MongoClients&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;create&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;MONGODB_URI&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="nc"&gt;MongoDatabase&lt;/span&gt; &lt;span class="n"&gt;database&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mongoClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getDatabase&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;DB_NAME&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="nc"&gt;MongoCollection&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;database&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getCollection&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;COLLECTION_NAME&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="c1"&gt;// store all embeddings in mongodb&lt;/span&gt;
            &lt;span class="n"&gt;storeAllEmbeddings&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;allEmbeddingsDocuments&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="c1"&gt;// create vector search index&lt;/span&gt;
            &lt;span class="n"&gt;setupVectorSearchIndex&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

            &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Querying data..."&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="n"&gt;runVectorSearchQueries&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;InterruptedException&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;printStackTrace&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;While it is perfectly fine to run this now, and we will use the full functionality of the auto-quantization with MongoDB, we are going to walk through how to update this class to handle pre-quantized vectors.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to ingest pre-quantized vectors
&lt;/h2&gt;

&lt;p&gt;Pre-quantized vectors are supported by some embedding models, such as Voyage AI’s &lt;code&gt;voyage-3-large&lt;/code&gt;. Using them means the model itself does the quantization step for us, so we can store and index the compressed representation directly instead of relying on MongoDB to apply automatic quantization during index build.&lt;/p&gt;

&lt;p&gt;This has two main benefits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Smaller payloads from the API: Instead of receiving large float32 arrays, we get compact &lt;code&gt;int8&lt;/code&gt; or even &lt;code&gt;int1&lt;/code&gt; arrays.
&lt;/li&gt;
&lt;li&gt;Less memory overhead in MongoDB: We store the already-quantized representation, which keeps both storage size and index RAM cost lower.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;MongoDB supports ingestion of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;int8&lt;/code&gt; vectors (stored as BSON &lt;code&gt;binData(int8)&lt;/code&gt;).
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;int1&lt;/code&gt; vectors (stored as BSON &lt;code&gt;binData(int1)&lt;/code&gt; with packed bits).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Unlike the auto-quantization paths, these do not require that we define our index specifically for &lt;code&gt;int8&lt;/code&gt; or &lt;code&gt;int1&lt;/code&gt; ingestion.&lt;/p&gt;

&lt;h3&gt;
  
  
  Embedding our data
&lt;/h3&gt;

&lt;p&gt;Now, to implement this, we will update our code from earlier. First, in &lt;code&gt;embedDataAndCreateDocument&lt;/code&gt;, we will update it to embed our data to &lt;code&gt;int8&lt;/code&gt; and &lt;code&gt;int1&lt;/code&gt; and return to our &lt;code&gt;ResponseBytes&lt;/code&gt; record.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;embedDataAndCreateDocument&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="nc"&gt;OkHttpClient&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OkHttpClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;connectTimeout&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;CONNECTION_TIMEOUT&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;TimeUnit&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;SECONDS&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;readTimeout&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;READ_TIMEOUT&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;TimeUnit&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;SECONDS&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="nc"&gt;ResponseDouble&lt;/span&gt; &lt;span class="n"&gt;baselineValues&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embedBaselineDoubles&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="nc"&gt;ResponseBytes&lt;/span&gt; &lt;span class="n"&gt;preQuantizedInt8&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embedPreQuantizedInt8&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="nc"&gt;ResponseBytes&lt;/span&gt; &lt;span class="n"&gt;preQuantizedInt1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embedPreQuantizedInt1&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

        &lt;span class="c1"&gt;// Create a list of BSON documents containing all embeddings  &lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;createCombinedEmbeddingsDocument&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
                &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
                &lt;span class="n"&gt;baselineValues&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
                &lt;span class="n"&gt;preQuantizedInt8&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
                &lt;span class="n"&gt;preQuantizedInt1&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;IOException&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;RuntimeException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Error fetching embedding: "&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We'll be using two new methods to get our pre-quantized data—first, &lt;code&gt;embedPreQuantizedInt8&lt;/code&gt;, which works much like our &lt;code&gt;embedBaselineDouble&lt;/code&gt;, but passes in the output data type &lt;code&gt;int8&lt;/code&gt; to tell Voyage AI that we want pre-quantized scalar values:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="nc"&gt;ResponseBytes&lt;/span&gt; &lt;span class="nf"&gt;embedPreQuantizedInt8&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;OkHttpClient&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;IOException&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;outputDataType&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"int8"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;OBJECT_MAPPER&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;readValue&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sendRequest&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;outputDataType&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt; &lt;span class="nc"&gt;ResponseBytes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We then need an &lt;code&gt;embedPreQuantizedInt1&lt;/code&gt; method to pass in the output type &lt;code&gt;ubinary&lt;/code&gt; to get our packed bits response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="nc"&gt;ResponseBytes&lt;/span&gt; &lt;span class="nf"&gt;embedPreQuantizedInt1&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;OkHttpClient&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;IOException&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;outputDataType&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"ubinary"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;OBJECT_MAPPER&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;readValue&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sendRequest&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;outputDataType&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt; &lt;span class="nc"&gt;ResponseBytes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  How to store pre-quantized data in MongoDB
&lt;/h3&gt;

&lt;p&gt;We also need to update our &lt;code&gt;createCombinedEmbeddingsDocument&lt;/code&gt; to take in these new embeddings to store in MongoDB.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Create a list of BSON document containing all embedding responses  &lt;/span&gt;
&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;createCombinedEmbeddingsDocument&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;texts&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
        &lt;span class="nc"&gt;ResponseDouble&lt;/span&gt; &lt;span class="n"&gt;floatResponse&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
        &lt;span class="nc"&gt;ResponseBytes&lt;/span&gt; &lt;span class="n"&gt;int8Response&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
        &lt;span class="nc"&gt;ResponseBytes&lt;/span&gt; &lt;span class="n"&gt;int1Response&lt;/span&gt;  
&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;embeddingsList&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ArrayList&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;();&lt;/span&gt;  

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;texts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;size&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="nc"&gt;Document&lt;/span&gt; &lt;span class="n"&gt;embeddingDoc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;texts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"embeddings_float32"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;floatResponse&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"embeddings_auto_scalar"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;floatResponse&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"embeddings_auto_binary"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;floatResponse&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"embeddings_int8"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;BinaryVector&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;int8Vector&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;int8Response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;()))&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"embeddings_int1"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;BinaryVector&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;packedBitVector&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;int1Response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;  
        &lt;span class="n"&gt;embeddingsList&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;add&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embeddingDoc&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;embeddingsList&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We still store doubles for the auto paths and we now store pre-quantized bytes for &lt;code&gt;int8&lt;/code&gt;/&lt;code&gt;int1&lt;/code&gt;. MongoDB Atlas treats them as quantized at ingestion, so no conversion.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;org.bson.BinaryVector&lt;/code&gt; is a helper type in the MongoDB Java Driver.  It wraps raw &lt;code&gt;byte[]&lt;/code&gt; so MongoDB knows we’re storing a vector type (&lt;code&gt;binData(int8)&lt;/code&gt; or &lt;code&gt;binData(int1)&lt;/code&gt;) instead of just a random blob of bytes.&lt;/p&gt;

&lt;p&gt;MongoDB Atlas Vector Search requires these wrapped types to understand how to treat the data when building a vector index.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;BinaryVector.int8Vector(...)&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Takes the &lt;code&gt;byte[]&lt;/code&gt; from our embedding model that already represents int8-quantized values.
&lt;/li&gt;
&lt;li&gt;Wraps it as a BSON &lt;code&gt;binData(int8)&lt;/code&gt; vector.
&lt;/li&gt;
&lt;li&gt;MongoDB will store this as a compact 8-bit signed integer vector, and the index will consume it directly with no further conversion.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;BinaryVector.packedBitVector(..., (byte) 0)&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Takes the &lt;code&gt;byte[]&lt;/code&gt; returned for 1-bit embeddings (&lt;code&gt;ubinary&lt;/code&gt;).
&lt;/li&gt;
&lt;li&gt;These vectors are bit-packed. Each bit represents one dimension of the embedding (&lt;code&gt;0&lt;/code&gt; if ≤ midpoint, &lt;code&gt;1&lt;/code&gt; if &amp;gt; midpoint).
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;BinaryVector.packedBitVector&lt;/code&gt; wraps that array and tags it as BSON &lt;code&gt;binData(int1)&lt;/code&gt;.
&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;(byte) 0&lt;/code&gt; is a padding value. If the vector’s number of dimensions isn’t an exact multiple of eight, MongoDB Atlas needs to know how to pad the unused bits in the last byte (usually &lt;code&gt;0&lt;/code&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Updating our Vector Search index
&lt;/h3&gt;

&lt;p&gt;Now, we need to update the &lt;code&gt;setupVectorSearchIndex&lt;/code&gt; to handle these new fields, and prepare them for vector search:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Create the Vector Search index  &lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setupVectorSearchIndex&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MongoCollection&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;InterruptedException&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
       &lt;span class="nc"&gt;Bson&lt;/span&gt; &lt;span class="n"&gt;definition&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
            &lt;span class="s"&gt;"fields"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
            &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
                    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"type"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"vector"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"path"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"embeddings_float32"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"numDimensions"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"similarity"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"dotProduct"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;  
                    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"type"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"vector"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"path"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"embeddings_auto_scalar"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"numDimensions"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"similarity"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"dotProduct"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"quantization"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"scalar"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;  
                    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"type"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"vector"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"path"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"embeddings_auto_binary"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"numDimensions"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"similarity"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"dotProduct"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"quantization"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"binary"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;  
                    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"type"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"vector"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"path"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"embeddings_int8"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"numDimensions"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"similarity"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"dotProduct"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;  
                    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"type"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"vector"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"path"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"embeddings_int1"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"numDimensions"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"similarity"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"euclidean"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
            &lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="o"&gt;);&lt;/span&gt;  

    &lt;span class="nc"&gt;SearchIndexModel&lt;/span&gt; &lt;span class="n"&gt;indexModel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;SearchIndexModel&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
            &lt;span class="no"&gt;INDEX_NAME&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
            &lt;span class="n"&gt;definition&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
            &lt;span class="nc"&gt;SearchIndexType&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;vectorSearch&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  
    &lt;span class="o"&gt;);&lt;/span&gt;  

    &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;createSearchIndexes&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Collections&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;singletonList&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;indexModel&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;

    &lt;span class="n"&gt;waitForIndex&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;INDEX_NAME&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We've added our two new fields, &lt;code&gt;embeddings_int8&lt;/code&gt; and &lt;code&gt;embeddings_int1&lt;/code&gt;, to our index definition. We are still using the same number of dimensions for our embeddings, as this is defined by the model used for embedding, There are two big differences.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;We do not need to define quantization.
&lt;/li&gt;
&lt;li&gt;For &lt;code&gt;int1&lt;/code&gt; vector quantization, we use the euclidean similarity method for searching. This is because &lt;code&gt;int1&lt;/code&gt; does not support &lt;code&gt;dotProduct&lt;/code&gt; or &lt;code&gt;cosine&lt;/code&gt; similarity methods.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If you have run the code before, and this index already exists on the database, you will need to delete it, or manually update it to avoid any duplicate index errors.&lt;/p&gt;

&lt;h3&gt;
  
  
  Searching pre-quantized vectors
&lt;/h3&gt;

&lt;p&gt;Now we have our new, pre-quantized data ready, we need to update our &lt;code&gt;runVectorSearchQueries&lt;/code&gt; method to embed our queries to these pre-quantised values.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Run MongoDB vector search queries for all fields using query embeddings  &lt;/span&gt;
&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;runVectorSearchQueries&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MongoCollection&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="nc"&gt;OkHttpClient&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OkHttpClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;connectTimeout&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;CONNECTION_TIMEOUT&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;TimeUnit&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;SECONDS&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;readTimeout&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;READ_TIMEOUT&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;TimeUnit&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;SECONDS&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  

    &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;queryInput&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;QUERY_TEXT&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; 
        &lt;span class="nc"&gt;ResponseDouble&lt;/span&gt; &lt;span class="n"&gt;qFloatResp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embedBaselineDoubles&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;queryInput&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Double&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;qFloat&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;qFloatResp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;getFirst&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  

        &lt;span class="nc"&gt;ResponseBytes&lt;/span&gt; &lt;span class="n"&gt;qInt8Resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embedPreQuantizedInt8&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;queryInput&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="nc"&gt;BinaryVector&lt;/span&gt; &lt;span class="n"&gt;qInt8&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;BinaryVector&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;int8Vector&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;qInt8Resp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;getFirst&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;  

        &lt;span class="nc"&gt;ResponseBytes&lt;/span&gt; &lt;span class="n"&gt;qInt1Resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embedPreQuantizedInt1&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;queryInput&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="nc"&gt;BinaryVector&lt;/span&gt; &lt;span class="n"&gt;qInt1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;BinaryVector&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;packedBitVector&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;qInt1Resp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;getFirst&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

        &lt;span class="n"&gt;runVectorSearchDouble&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;qFloat&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

        &lt;span class="n"&gt;runVectorSearchBinary&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;qInt8&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;qInt1&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

    &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;IOException&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;RuntimeException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Error embedding query vector: "&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We'll use our &lt;code&gt;embedPreQuantizedInt8&lt;/code&gt; and &lt;code&gt;embedPreQuantizedInt1&lt;/code&gt; methods to get our embeddings for our query.&lt;/p&gt;

&lt;p&gt;Lastly, we need a new method &lt;code&gt;runVectorSearchBinary&lt;/code&gt; method to query MongoDB with these values:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;runVectorSearchBinary&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
        &lt;span class="nc"&gt;MongoCollection&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
        &lt;span class="nc"&gt;BinaryVector&lt;/span&gt; &lt;span class="n"&gt;qInt8&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
        &lt;span class="nc"&gt;BinaryVector&lt;/span&gt; &lt;span class="n"&gt;qInt1&lt;/span&gt;  
&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;BinaryVector&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;queryByField&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;LinkedHashMap&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;();&lt;/span&gt;  
    &lt;span class="n"&gt;queryByField&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"embeddings_int8"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;qInt8&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// int8  &lt;/span&gt;
    &lt;span class="n"&gt;queryByField&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"embeddings_int1"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;qInt1&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// packed 1-bit  &lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Entry&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;BinaryVector&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;entry&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;queryByField&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;entrySet&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Bson&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;pipeline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;asList&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
                &lt;span class="n"&gt;vectorSearch&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
                        &lt;span class="n"&gt;fieldPath&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;entry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getKey&lt;/span&gt;&lt;span class="o"&gt;()),&lt;/span&gt;  
                        &lt;span class="n"&gt;entry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getValue&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt;  
                        &lt;span class="no"&gt;INDEX_NAME&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
                        &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;approximateVectorSearchOptions&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                &lt;span class="o"&gt;),&lt;/span&gt;  
                &lt;span class="n"&gt;project&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fields&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
                        &lt;span class="n"&gt;exclude&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"_id"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;  
                        &lt;span class="n"&gt;include&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;  
                        &lt;span class="n"&gt;metaVectorSearchScore&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"vectorSearchScore"&lt;/span&gt;&lt;span class="o"&gt;))));&lt;/span&gt;  

        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;aggregate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pipeline&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;into&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ArrayList&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;());&lt;/span&gt;  

        &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Outputting results: "&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toJson&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;  
        &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Running our application
&lt;/h2&gt;

&lt;p&gt;We can now run our application with the command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mvn clean compile &lt;span class="nb"&gt;exec&lt;/span&gt;:java &lt;span class="nt"&gt;-Dexec&lt;/span&gt;.mainClass&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"com.timkelly.Main"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And we’ll see our results printed to the console:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;Outputting&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;results:&lt;/span&gt;&lt;span class="w"&gt; 
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The Great Wall of China is visible from space."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"vectorSearchScore"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.8666857481002808&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Mount Everest is the highest peak on Earth at 8,848m."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"vectorSearchScore"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.8523434400558472&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;Outputting&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;results:&lt;/span&gt;&lt;span class="w"&gt; 
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The Great Wall of China is visible from space."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"vectorSearchScore"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.8664929866790771&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Mount Everest is the highest peak on Earth at 8,848m."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"vectorSearchScore"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.8526520729064941&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;Outputting&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;results:&lt;/span&gt;&lt;span class="w"&gt; 
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The Great Wall of China is visible from space."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"vectorSearchScore"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.8666857481002808&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Mount Everest is the highest peak on Earth at 8,848m."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"vectorSearchScore"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.8523434400558472&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;Outputting&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;results:&lt;/span&gt;&lt;span class="w"&gt; 
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The Great Wall of China is visible from space."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"vectorSearchScore"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.5071338415145874&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Mount Everest is the highest peak on Earth at 8,848m."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"vectorSearchScore"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.5067881345748901&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;Outputting&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;results:&lt;/span&gt;&lt;span class="w"&gt; 
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The Great Wall of China is visible from space."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"vectorSearchScore"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.771484375&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The Mona Lisa was painted by Leonardo da Vinci."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"vectorSearchScore"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.7451171875&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You’ll notice that the various methods we used (baseline float32, automatic scalar, automatic binary, and the pre-quantized int8/int1 paths) all return slightly different results and scores.&lt;/p&gt;

&lt;p&gt;That’s the trade-off in action:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Float32 (no quantization): Highest fidelity, most memory usage.
&lt;/li&gt;
&lt;li&gt;Auto-scalar: Scores almost identical to float32, but RAM cost is a fraction.
&lt;/li&gt;
&lt;li&gt;Auto-binary: Tighter memory, recall dips slightly depending on the dataset.
&lt;/li&gt;
&lt;li&gt;Pre-quantized int8/int1: Smallest payloads, fastest index use, but you can see the scores shift more noticeably.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is why MongoDB gives us all these options. We can pick the right balance of accuracy, memory footprint, and query latency for our application.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;With this setup, we’ve seen how to embed data, store it in MongoDB, and query it with Atlas Vector Search across multiple quantization strategies. Automatic quantization makes it easy to reduce memory while keeping high recall, while pre-quantized vectors give you the smallest payloads and fastest queries when your embedding model supports them. The trade-offs between accuracy, resource usage, and speed become clear once you run the same query through each index path.&lt;/p&gt;

&lt;p&gt;By experimenting with float32, scalar, binary, int8, and int1 vectors side by side, you can choose the right approach for your application’s needs, whether that’s maximum recall, minimal resource footprint, or a balance between the two.&lt;/p&gt;

&lt;p&gt;If you found this tutorial useful, and want to learn more about what you can do with MongoDB in Java, check out &lt;a href="https://dev.to/mongodb/turning-my-obsidian-vault-into-a-searchable-wiki-with-spring-boot-ef8"&gt;how I turned my Obsidian vault into a searchable wiki&lt;/a&gt; or &lt;a href="https://dev.to/mongodb/secure-local-rag-with-role-based-access-spring-ai-ollama-mongodb-1gj"&gt;Secure Local RAG with Role-Based Access: Spring AI, Ollama &amp;amp; MongoDB&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>mongodb</category>
      <category>java</category>
      <category>ai</category>
      <category>vectordatabase</category>
    </item>
    <item>
      <title>Turning My Obsidian Vault Into a Searchable Wiki With Spring Boot and MongoDB</title>
      <dc:creator>Tim Kelly</dc:creator>
      <pubDate>Wed, 13 Aug 2025 10:31:36 +0000</pubDate>
      <link>https://forem.com/mongodb/turning-my-obsidian-vault-into-a-searchable-wiki-with-spring-boot-ef8</link>
      <guid>https://forem.com/mongodb/turning-my-obsidian-vault-into-a-searchable-wiki-with-spring-boot-ef8</guid>
      <description>&lt;p&gt;Like any note-taker worth their salt, I’ve built up a cumbersome, often neglected pile of half-formed thoughts or ideas that crossed my mind throughout the day. I’ve always believed there were nuggets of value buried in that chaos.&lt;/p&gt;

&lt;p&gt;Among these notes are recipes, tech experiments, a plethora of computer science knowledge from college, and many follow standardized templates I created. I wanted to share them online through a blog, making them searchable for anyone who might find them useful.&lt;/p&gt;

&lt;p&gt;So instead of dumping these into a CMS by hand, I decided to repurpose my &lt;a href="https://obsidian.md/" rel="noopener noreferrer"&gt;Obsidian&lt;/a&gt; vault into a searchable wiki I can embed on my blog. Since everything is already in Markdown and includes metadata in &lt;a href="https://dev.to/dailydevtips1/what-exactly-is-frontmatter-123g"&gt;frontmatter&lt;/a&gt;, I could skip the copy-paste slog and get straight to building something useful.&lt;/p&gt;

&lt;p&gt;This tutorial walks through how I turned my local notes into a web-accessible knowledge base, with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MongoDB Atlas Search for full-text, fuzzy, and autocomplete querying.
&lt;/li&gt;
&lt;li&gt;Spring Boot for ingesting and formatting data, as well as the REST API.
&lt;/li&gt;
&lt;li&gt;Custom weighting and boosting to prioritise my best and most useful notes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I’ll be indexing my own technical notes on some frameworks as a demo, and I’ll show you how to expose and query them through an API, so you can do the same with yours.&lt;/p&gt;

&lt;p&gt;If you just want the code, it's all available in this &lt;a href="https://github.com/mongodb-developer/SpringSearch" rel="noopener noreferrer"&gt;Github repo&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Obsidian?
&lt;/h2&gt;

&lt;p&gt;Obsidian is a free note-taking app that lets you write and organize your thoughts in plain text using Markdown. It’s designed for people who want to build a personal knowledge base, with features like backlinks and graph views, and it’s designed to help users organize and structure their thoughts and knowledge in a flexible, non-linear way.&lt;/p&gt;

&lt;p&gt;I like Obsidian as a note-taking app because it allows a lot of customization, and it’s really just a wrapper for viewing Markdown files. That means I never have to worry about my notes being locked away if Obsidian disappears. They’re still just plain text on disk. It also has a vast community plugin ecosystem and makes it easy to build your own, so I can optimize my workflow exactly how I want.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why MongoDB Atlas Search makes sense
&lt;/h2&gt;

&lt;p&gt;Since Obsidian stores everything as plain Markdown with optional frontmatter, the structure maps cleanly to documents in MongoDB. I don’t have to force a schema onto my notes and can store multiple formats in the same &lt;a href="https://www.mongodb.com/docs/manual/core/databases-and-collections/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=obsidian+search&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;collection&lt;/a&gt;. Each one can have different tags, metadata, or headings, and MongoDB’s flexible model handles that without complaint. But the real reason this setup works so well is &lt;a href="https://www.mongodb.com/products/platform/atlas-search??utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=obsidian+search&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;MongoDB Atlas Search&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;It gives us &lt;a href="https://www.mongodb.com/docs/atlas/atlas-search/tutorial/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=obsidian+search&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;full-text search&lt;/a&gt;, &lt;a href="https://www.mongodb.com/docs/atlas/atlas-search/autocomplete/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=obsidian+search&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;autocomplete&lt;/a&gt;, and &lt;a href="https://www.mongodb.com/resources/basics/fuzzy-match?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=obsidian+search&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;fuzzy matching&lt;/a&gt;, all built in, no separate indexing service or extra infrastructure needed. We can even &lt;a href="https://www.mongodb.com/docs/atlas/atlas-search/score/modify-score/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=obsidian+search&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;boost certain fields&lt;/a&gt; to prioritize titles, tags, or whatever matters most to us. Want to weigh title matches higher than body content? Or prioritize notes with certain tags? Totally up to us.&lt;/p&gt;

&lt;p&gt;The core of it all is the &lt;a href="https://www.mongodb.com/docs/atlas/atlas-search/query-syntax/#std-label-search-stage?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=obsidian+search&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;&lt;code&gt;$search&lt;/code&gt; stage&lt;/a&gt;. For something as messy and inconsistent as a personal vault of thoughts, it’s more than robust enough to build the search for my wiki.&lt;/p&gt;

&lt;p&gt;With that out of the way, let’s get into how I actually built it, starting with parsing notes and getting them into MongoDB.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;Before we get started, there are a few things we'll need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;a href="https://www.mongodb.com/docs/guides/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=obsidian+search&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;MongoDB Atlas account with a cluster set up&lt;/a&gt;

&lt;ul&gt;
&lt;li&gt;A free M0 cluster is perfect for this
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Java on your machine, set up and ready to go (I use &lt;a href="https://jdk.java.net/24/" rel="noopener noreferrer"&gt;Java version 24&lt;/a&gt;, but for some of the code, you will need at least Java version 22!)
&lt;/li&gt;

&lt;li&gt;

&lt;a href="https://maven.apache.org/download.cgi" rel="noopener noreferrer"&gt;Maven&lt;/a&gt; (I use version 3.9.10)
&lt;/li&gt;

&lt;li&gt;An &lt;a href="https://obsidian.md/download" rel="noopener noreferrer"&gt;Obsidian vault&lt;/a&gt;, with some notes inside&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Creating our Spring app
&lt;/h2&gt;

&lt;p&gt;To get started, we’ll scaffold a simple Spring Boot project using &lt;a href="https://start.spring.io/" rel="noopener noreferrer"&gt;Spring Initializr&lt;/a&gt;. We’re keeping things minimal, just enough to expose a REST API and connect to MongoDB.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fulhkr9z9l07pjs60xo3r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fulhkr9z9l07pjs60xo3r.png" alt="Spring initializr screen displaying config described below" width="800" height="606"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here’s what I selected on Spring Initializr:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Project: Maven
&lt;/li&gt;
&lt;li&gt;Language: Java
&lt;/li&gt;
&lt;li&gt;Spring Boot: 3.x (any stable version works)
&lt;/li&gt;
&lt;li&gt;Dependencies:

&lt;ul&gt;
&lt;li&gt;Spring Web—to expose our REST API
&lt;/li&gt;
&lt;li&gt;Spring Data MongoDB—to interact with our MongoDB collection&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;Once we've configured those options, give it a name (&lt;code&gt;SpringSearch&lt;/code&gt;) and define the group (&lt;code&gt;com.timkelly&lt;/code&gt;), click Generate, unzip the project, and open it in our IDE of choice.&lt;/p&gt;

&lt;p&gt;From here, we’ll start wiring things up so that Spring can read our notes, store them in MongoDB, and expose them through a clean API.&lt;/p&gt;

&lt;h2&gt;
  
  
  Storing in MongoDB
&lt;/h2&gt;

&lt;p&gt;Before we start importing notes, let’s make sure our app can actually talk to MongoDB.&lt;/p&gt;

&lt;p&gt;In our &lt;code&gt;application.properties&lt;/code&gt;, we add the connection string for our MongoDB Atlas cluster, and specify the database we'd like to use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;spring.data.mongodb.uri=YOUR-CONNECTION-STRING
spring.data.mongodb.database=obsidian
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Using &lt;a href="https://www.mongodb.com/atlas/database?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=obsidian+search&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;MongoDB Atlas&lt;/a&gt;, you can grab your connection string from the cluster dashboard. Just make sure to whitelist your IP and create a database user.&lt;/p&gt;

&lt;p&gt;Now, let’s define a simple repository so Spring Data MongoDB can handle the persistence for us. Create a package repository and in it, we'll define an interface &lt;code&gt;NoteRepository&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.timkelly.springsearch.repository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.timkelly.springsearch.model.Note&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.mongodb.repository.MongoRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Repository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Optional&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="nd"&gt;@Repository&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;interface&lt;/span&gt; &lt;span class="nc"&gt;NoteRepository&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;MongoRepository&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Note&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;Optional&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Note&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;findByTitle&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives us built-in CRUD operations out of the box (thanks, &lt;a href="https://docs.spring.io/spring-data/mongodb/docs/current/api/org/springframework/data/mongodb/repository/MongoRepository.html" rel="noopener noreferrer"&gt;&lt;code&gt;MongoRepository&lt;/code&gt;&lt;/a&gt;), and we add one custom method: &lt;code&gt;findByTitle(String title)&lt;/code&gt;, which we’ll use later to check if a note already exists before inserting or updating it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Ingesting Obsidian notes
&lt;/h2&gt;

&lt;p&gt;For this tutorial, I’m using a few sample notes on tech frameworks, each following a simple Markdown template, that I have included in the &lt;a href="https://github.com/mongodb-developer/SpringSearch" rel="noopener noreferrer"&gt;Github repo&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;---
tags:
  - List
  - of
  - items
created: YYYY-MM-DD HH:mm
---
# Title
Content Lorem Ipsum

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This structure gives us a frontmatter block that we’ll treat as metadata. The &lt;code&gt;tags&lt;/code&gt; array defines relevant topics for the note (e.g., &lt;code&gt;AI&lt;/code&gt;, &lt;code&gt;Java&lt;/code&gt;, &lt;code&gt;REST&lt;/code&gt;), and &lt;code&gt;created&lt;/code&gt; marks when the note was written. Below that, we assume a level-one heading (&lt;code&gt;#&lt;/code&gt;) for the title, followed by the main body content.&lt;/p&gt;

&lt;p&gt;If your notes are structured differently, you can adjust the parser to suit. There’s nothing stopping you from using the filename as a title, omitting the frontmatter, or extending the metadata. MongoDB is even happy to handle documents with varied structures in the same &lt;a href="https://www.mongodb.com/docs/manual/core/databases-and-collections/??utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=obsidian+search&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;collection&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;These notes will be mapped to MongoDB documents like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"String"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"tags"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"String"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"createdAt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ISODate"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"String"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, in order to interact with MongoDB and map our documents properly, we’ll need a &lt;code&gt;Note&lt;/code&gt; model.&lt;/p&gt;

&lt;p&gt;Create a new class called &lt;code&gt;Note&lt;/code&gt; inside a new package named &lt;code&gt;model&lt;/code&gt; in our Spring app. We’ll use the &lt;code&gt;@Document&lt;/code&gt; annotation from Spring Data MongoDB to let Spring know that this class represents a document in the &lt;code&gt;"notes"&lt;/code&gt; collection.&lt;/p&gt;

&lt;p&gt;Here’s what it looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.timkelly.springsearch.model&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.mongodb.core.mapping.Document&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Date&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"notes"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Note&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;createdAt&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;Note&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;Note&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;createdAt&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;tags&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;createdAt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;createdAt&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getTitle&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setTitle&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getContent&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setContent&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="nf"&gt;getTags&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setTags&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;tags&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="nf"&gt;getCreatedAt&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;createdAt&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setCreatedAt&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;createdAt&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;createdAt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;createdAt&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Nothing fancy—we're defining the fields we want to store in MongoDB, along with a no-args constructor (needed by Spring) and a full constructor for convenience. Then, we generate the usual getters and setters to interact with our object.&lt;/p&gt;

&lt;p&gt;Now that we’ve got a &lt;code&gt;Note&lt;/code&gt; model and a structure we’re happy with, let’s actually load the files from our Obsidian vault and push them into MongoDB.&lt;/p&gt;

&lt;p&gt;First, let's define where our notes live. In our &lt;code&gt;application.properties&lt;/code&gt; or &lt;code&gt;application.yml&lt;/code&gt;, we need to add the following entry pointing to our notes folder. Here’s my dummy data in the &lt;a href="https://github.com/mongodb-developer/SpringSearch" rel="noopener noreferrer"&gt;GitHub repo&lt;/a&gt;, if you’re following along:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;notes.folder.path=dummyData
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, we’ll bind that value using a simple Spring &lt;code&gt;@ConfigurationProperties&lt;/code&gt; class. Create a &lt;code&gt;config&lt;/code&gt; package and add the class below, &lt;code&gt;NotesFolderProperties&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.timkelly.springsearch.config&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.boot.context.properties.ConfigurationProperties&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.context.annotation.Configuration&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="nd"&gt;@Configuration&lt;/span&gt;
&lt;span class="nd"&gt;@ConfigurationProperties&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prefix&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"notes.folder"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;NotesFolderProperties&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getPath&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setPath&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This lets us inject the file path wherever we need it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Parsing Markdown
&lt;/h3&gt;

&lt;p&gt;After our config, we need some logic to parse our Markdown and extract our data. Since the format of our notes is predictable for this example, converting them into our document structure is straightforward. Below is an implementation of a &lt;code&gt;MarkdownNoteParser&lt;/code&gt; class, which we can place in a new &lt;code&gt;util&lt;/code&gt; package:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.timkelly.springsearch.util&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.timkelly.springsearch.model.Note&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Component&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.yaml.snakeyaml.Yaml&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.text.ParseException&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.text.SimpleDateFormat&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.regex.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@Component&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MarkdownNoteParser&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;Pattern&lt;/span&gt; &lt;span class="no"&gt;FRONTMATTER_PATTERN&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Pattern&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;compile&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"^---\\s*\\n(.*?)\\n---\\s*\\n"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Pattern&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;DOTALL&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;SimpleDateFormat&lt;/span&gt; &lt;span class="no"&gt;DATE_FORMAT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;SimpleDateFormat&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"yyyy-MM-dd HH:mm"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Note&lt;/span&gt; &lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;rawMarkdown&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="nc"&gt;Matcher&lt;/span&gt; &lt;span class="n"&gt;matcher&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;FRONTMATTER_PATTERN&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;matcher&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rawMarkdown&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

        &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;HashMap&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;();&lt;/span&gt;  
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;matcher&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;find&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;frontmatter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;matcher&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;group&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
            &lt;span class="n"&gt;rawMarkdown&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rawMarkdown&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;substring&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;matcher&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;end&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;  
            &lt;span class="nc"&gt;Yaml&lt;/span&gt; &lt;span class="n"&gt;yaml&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Yaml&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
            &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;yaml&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;load&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frontmatter&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="o"&gt;}&lt;/span&gt;  

        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;extractTitle&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rawMarkdown&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rawMarkdown&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;trim&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  

        &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;tags&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Optional&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofNullable&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"tags"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;map&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;((&lt;/span&gt;&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;?&amp;gt;)&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;map&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nl"&gt;Object:&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;toString&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;toArray&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]::&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;orElse&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;]);&lt;/span&gt;  

        &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;createdAt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Optional&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofNullable&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"created"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;map&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nl"&gt;Object:&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;toString&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;map&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;parseDate&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;orElse&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Note&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;createdAt&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;extractTitle&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;markdown&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="nc"&gt;Scanner&lt;/span&gt; &lt;span class="n"&gt;scanner&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Scanner&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;markdown&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scanner&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;hasNextLine&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;scanner&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;nextLine&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;trim&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;startsWith&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"# "&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;substring&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;trim&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
            &lt;span class="o"&gt;}&lt;/span&gt;  
        &lt;span class="o"&gt;}&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"Untitled"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="nf"&gt;parseDate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;dateStr&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;DATE_FORMAT&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;parse&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dateStr&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ParseException&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;err&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to parse date: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;dateStr&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This class uses &lt;strong&gt;YAML&lt;/strong&gt; to parse the frontmatter metadata block, a &lt;strong&gt;regular expression&lt;/strong&gt; to detect the frontmatter, and a simple scanner to extract the title.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;FRONTMATTER_PATTERN&lt;/code&gt; constant defines a &lt;a href="https://en.wikipedia.org/wiki/Regular_expression" rel="noopener noreferrer"&gt;regex&lt;/a&gt; for identifying YAML frontmatter blocks in the Markdown content. For example, if the frontmatter is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;---
tags:
  - java
  - spring
created: 2025-07-23 12:00
---
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It will be extracted as &lt;code&gt;{tags=[java, spring], created="2025-07-23 12:00"}&lt;/code&gt;, and the remaining content will be stored in &lt;code&gt;rawMarkdown&lt;/code&gt; after the frontmatter is removed.&lt;/p&gt;

&lt;p&gt;Next, we use the &lt;code&gt;extractTitle(rawMarkdown)&lt;/code&gt; helper method to detect the note’s title, provided the Markdown starts with a &lt;code&gt;# Heading&lt;/code&gt;. We also check for a &lt;code&gt;tags&lt;/code&gt; field in the metadata, converting it to a string array if present.&lt;/p&gt;

&lt;p&gt;For the creation date, &lt;code&gt;parseDate(String dateStr)&lt;/code&gt; parses the &lt;code&gt;created&lt;/code&gt; value from the metadata into a &lt;code&gt;Date&lt;/code&gt; object.&lt;/p&gt;

&lt;p&gt;Finally, we trim any extra whitespace from the Markdown body using &lt;code&gt;rawMarkdown.trim()&lt;/code&gt; before returning a fully populated &lt;code&gt;Note&lt;/code&gt; object.&lt;/p&gt;

&lt;h3&gt;
  
  
  Importing from our vault
&lt;/h3&gt;

&lt;p&gt;Now that we can parse our Markdown notes, the next step is to scan an entire folder in our vault, parse every &lt;code&gt;.md&lt;/code&gt; file, and insert or update each note in MongoDB. For this, we’ll create a service called &lt;code&gt;NoteImportService&lt;/code&gt; inside a new &lt;code&gt;service&lt;/code&gt; package:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.timkelly.springsearch.service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.timkelly.springsearch.config.NotesFolderProperties&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.timkelly.springsearch.model.Note&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.timkelly.springsearch.repository.NoteRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.timkelly.springsearch.util.MarkdownNoteParser&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.io.IOException&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.nio.file.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Arrays&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.List&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Objects&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="nd"&gt;@Service&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;NoteImportService&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;NotesFolderProperties&lt;/span&gt; &lt;span class="n"&gt;props&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;MarkdownNoteParser&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;NoteRepository&lt;/span&gt; &lt;span class="n"&gt;repository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;NoteImportService&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;NotesFolderProperties&lt;/span&gt; &lt;span class="n"&gt;props&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;MarkdownNoteParser&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;NoteRepository&lt;/span&gt; &lt;span class="n"&gt;repository&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;props&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;props&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;parser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;repository&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;repository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;importNotes&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;IOException&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Path&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;files&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Files&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;walk&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Paths&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;props&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getPath&lt;/span&gt;&lt;span class="o"&gt;()))&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;filter&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toString&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;endsWith&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".md"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toList&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Path&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;files&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Files&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;readString&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="nc"&gt;Note&lt;/span&gt; &lt;span class="n"&gt;newNote&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;parse&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

            &lt;span class="n"&gt;repository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findByTitle&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;newNote&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getTitle&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
                    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ifPresentOrElse&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;existing&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                        &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="n"&gt;changed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
                                &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nc"&gt;Objects&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;equals&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;existing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getContent&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;newNote&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getContent&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt;
                                        &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nc"&gt;Arrays&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;equals&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;existing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getTags&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;newNote&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getTags&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt;
                                        &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nc"&gt;Objects&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;equals&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;existing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getCreatedAt&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;newNote&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getCreatedAt&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;

                        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;changed&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                            &lt;span class="n"&gt;existing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setContent&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;newNote&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getContent&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
                            &lt;span class="n"&gt;existing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setTags&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;newNote&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getTags&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
                            &lt;span class="n"&gt;existing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setCreatedAt&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;newNote&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getCreatedAt&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
                            &lt;span class="n"&gt;repository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;save&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;existing&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
                            &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Updated: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;existing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getTitle&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
                        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                            &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Skipped (no changes): "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;existing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getTitle&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;  
                        &lt;span class="o"&gt;}&lt;/span&gt;  
                    &lt;span class="o"&gt;},&lt;/span&gt; &lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
                        &lt;span class="n"&gt;repository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;save&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;newNote&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
                        &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Inserted: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;newNote&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getTitle&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;  
                    &lt;span class="o"&gt;});&lt;/span&gt;  
        &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This service walks through your vault, finds every &lt;code&gt;.md&lt;/code&gt; file, and parses it into a &lt;code&gt;Note&lt;/code&gt; object. It then checks MongoDB for an existing note with the same title. The repository method &lt;code&gt;findByTitle(...)&lt;/code&gt; looks for a document with the same title.&lt;/p&gt;

&lt;p&gt;If no note with that title exists, the note is saved as a new document. If a note is found, the service compares three fields to detect changes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;content&lt;/code&gt;: Compared using &lt;code&gt;Objects.equals(...)&lt;/code&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;tags&lt;/code&gt;: Compared using &lt;code&gt;Arrays.equals(...)&lt;/code&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;createdAt&lt;/code&gt;: Compared using &lt;code&gt;Objects.equals(...)&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If any of these fields differ, the existing note is updated with the new values. Otherwise, the write is skipped to avoid unnecessary database operations. This approach is redimentary, but ensures we can run the import multiple times without creating duplicates or re-writing unchanged notes.&lt;/p&gt;

&lt;p&gt;The duplicate detection here is very basic. It assumes that titles are unique across all notes. While this is fine for smaller directories (as in this example), larger vaults may benefit from optimizations such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reducing the number of file checks by caching file modification times, or manually calling the files to upload.
&lt;/li&gt;
&lt;li&gt;Minimizing database queries by batch operations or indexing our regularly queried fields like &lt;code&gt;title&lt;/code&gt; and &lt;code&gt;createdAt&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Search our database with MongoDB Atlas Search
&lt;/h2&gt;

&lt;p&gt;We have our documents in our database, modelled as we wish them to be. Now, it is time to search using the &lt;a href="https://www.mongodb.com/docs/atlas/atlas-search/aggregation-stages/search/??utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=obsidian+search&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;&lt;code&gt;$search&lt;/code&gt;&lt;/a&gt; operator. This will allow us to do our &lt;a href="https://www.mongodb.com/docs/atlas/atlas-search/tutorial/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=obsidian+search&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;full-text search&lt;/a&gt;, &lt;a href="https://www.mongodb.com/docs/atlas/atlas-search/autocomplete/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=obsidian+search&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;autocomplete&lt;/a&gt;, &lt;a href="https://www.mongodb.com/resources/basics/fuzzy-match?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=obsidian+search&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;fuzzy matching&lt;/a&gt;, and &lt;a href="https://www.mongodb.com/docs/atlas/atlas-search/score/modify-score/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=obsidian+search&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;custom boosting&lt;/a&gt;. I'll introduce them one after the other so we can have a better idea of how they all work individually, and then wrap it up with all of them in the one search query.&lt;/p&gt;

&lt;h3&gt;
  
  
  Our Atlas Search index
&lt;/h3&gt;

&lt;p&gt;Before we can perform full-text queries, we need to configure an &lt;a href="https://www.mongodb.com/docs/atlas/atlas-search/index-definitions/??utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=obsidian+search&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;Atlas Search index&lt;/a&gt; for our collection. In our MongoDB Atlas cluster, go to the collection where you will be storing your documents and select the search index tab:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffsbizasblqjtv48tkfmx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffsbizasblqjtv48tkfmx.png" alt="MongoDB Atlas UI displaying obsidian notes collection" width="800" height="303"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We need to create a new index for the &lt;code&gt;notes&lt;/code&gt; collection in a database called &lt;code&gt;obsidian&lt;/code&gt; with the following JSON configuration. Since we haven’t run our application yet, we need to manually &lt;a href="https://www.mongodb.com/docs/manual/core/databases-and-collections/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=obsidian+search&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;create the &lt;code&gt;notes&lt;/code&gt; collection and the &lt;code&gt;obsidian&lt;/code&gt; database&lt;/a&gt;. Then, we can add our search index like below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mappings"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"dynamic"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"fields"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"autocomplete"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"tags"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The configuration above enables full-text search on all three fields, and we’ve added an &lt;code&gt;autocomplete&lt;/code&gt; mapping on the &lt;code&gt;title&lt;/code&gt; field. This gives us the foundation for type-ahead search later.&lt;/p&gt;

&lt;h3&gt;
  
  
  Simple search
&lt;/h3&gt;

&lt;p&gt;We’ll start with a simple search method that queries across all three fields (&lt;code&gt;title&lt;/code&gt;, &lt;code&gt;content&lt;/code&gt;, and &lt;code&gt;tags&lt;/code&gt;) using the &lt;code&gt;$search&lt;/code&gt; operator. We’ll build from here as we add more advanced search capabilities. In the &lt;code&gt;repository&lt;/code&gt; package, create a &lt;code&gt;NoteSearchRepository&lt;/code&gt; class like below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.timkelly.springsearch.repository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.timkelly.springsearch.model.Note&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.bson.Document&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.beans.factory.annotation.Autowired&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.domain.Sort&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.mongodb.core.MongoTemplate&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.mongodb.core.aggregation.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Repository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.List&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@Repository&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;NoteSearchRepository&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;MongoTemplate&lt;/span&gt; &lt;span class="n"&gt;mongoTemplate&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="nd"&gt;@Autowired&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;NoteSearchRepository&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MongoTemplate&lt;/span&gt; &lt;span class="n"&gt;mongoTemplate&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;mongoTemplate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mongoTemplate&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Note&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;AggregationOperation&lt;/span&gt; &lt;span class="n"&gt;searchStage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"$search"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"index"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"default"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"query"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"path"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"title"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"content"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"tags"&lt;/span&gt;&lt;span class="o"&gt;)))&lt;/span&gt;
        &lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="nc"&gt;Aggregation&lt;/span&gt; &lt;span class="n"&gt;aggregation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Aggregation&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;newAggregation&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;searchStage&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;mongoTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;aggregate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;aggregation&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"notes"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Note&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;getMappedResults&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// we'll add more search methods here...&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is our most basic search function. We use &lt;code&gt;MongoTemplate&lt;/code&gt; to run an aggregation pipeline that includes a &lt;code&gt;$search&lt;/code&gt; stage. The &lt;code&gt;text&lt;/code&gt; operator tells MongoDB Atlas Search to look for the query string across the &lt;code&gt;title&lt;/code&gt;, &lt;code&gt;content&lt;/code&gt;, and &lt;code&gt;tags&lt;/code&gt; fields. The results are then mapped directly into our &lt;code&gt;Note&lt;/code&gt; model, giving us strongly typed objects ready for use in the rest of the application.&lt;/p&gt;

&lt;p&gt;Next, we expose a REST endpoint so our frontend (or even just &lt;code&gt;curl&lt;/code&gt;) can hit &lt;code&gt;/api/notes/search?q=...&lt;/code&gt; and get matching results. Create a &lt;code&gt;controller&lt;/code&gt; package and add a new class, &lt;code&gt;NoteSearchController&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.timkelly.springsearch.controller&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.timkelly.springsearch.model.Note&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.timkelly.springsearch.repository.NoteSearchRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.web.bind.annotation.GetMapping&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.web.bind.annotation.RequestMapping&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.web.bind.annotation.RequestParam&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.web.bind.annotation.RestController&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.List&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@RestController&lt;/span&gt;  
&lt;span class="nd"&gt;@RequestMapping&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/api/notes"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;NoteSearchController&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;NoteSearchRepository&lt;/span&gt; &lt;span class="n"&gt;searchRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;NoteSearchController&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;NoteSearchRepository&lt;/span&gt; &lt;span class="n"&gt;searchRepository&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;searchRepository&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;searchRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="nd"&gt;@GetMapping&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/search"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Note&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nd"&gt;@RequestParam&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;searchRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;search&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// we'll add more endpoints here&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, we’ve wired up a simple &lt;code&gt;/api/notes/search&lt;/code&gt; endpoint that takes a &lt;code&gt;q&lt;/code&gt; parameter and passes it along to our repository. The controller returns a list of &lt;code&gt;Note&lt;/code&gt; objects as JSON, making it easy for any frontend (or even a quick &lt;code&gt;curl&lt;/code&gt; request) to query our indexed notes and get instant results.&lt;/p&gt;

&lt;h3&gt;
  
  
  Search boosted by title
&lt;/h3&gt;

&lt;p&gt;By default, MongoDB Atlas Search treats all fields equally. But often, a match in the &lt;code&gt;title&lt;/code&gt; should matter more than one in the body text. We can achieve this by using the &lt;code&gt;compound&lt;/code&gt; operator and applying a &lt;code&gt;boost&lt;/code&gt; to the score of title matches. We need to add the following to our &lt;code&gt;NoteSearchRepository&lt;/code&gt;, below the earlier code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Note&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;searchBoostedByTitle&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="nc"&gt;AggregationOperation&lt;/span&gt; &lt;span class="n"&gt;searchStage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"$search"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
                &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"index"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"default"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"compound"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"should"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
                                &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"query"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"path"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"title"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"score"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"boost"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"value"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;)))),&lt;/span&gt; &lt;span class="c1"&gt;// Boost title matches&lt;/span&gt;
                                &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"query"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"path"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"content"&lt;/span&gt;&lt;span class="o"&gt;)),&lt;/span&gt;
                                &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"query"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"path"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"tags"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
                        &lt;span class="o"&gt;)))&lt;/span&gt;
        &lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="nc"&gt;AggregationOperation&lt;/span&gt; &lt;span class="n"&gt;sortStage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Aggregation&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sort&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Sort&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;by&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Sort&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Order&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;desc&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"score"&lt;/span&gt;&lt;span class="o"&gt;)));&lt;/span&gt;

        &lt;span class="nc"&gt;Aggregation&lt;/span&gt; &lt;span class="n"&gt;agg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Aggregation&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;newAggregation&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;searchStage&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sortStage&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;mongoTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;aggregate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agg&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"notes"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Note&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;getMappedResults&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And in our &lt;code&gt;NoteSearchController&lt;/code&gt;, we wire this up in our new endpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;    &lt;span class="nd"&gt;@GetMapping&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/search/boost-title"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Note&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;searchBoostedByTitle&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nd"&gt;@RequestParam&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;searchRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;searchBoostedByTitle&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Search boosted by title and tags
&lt;/h3&gt;

&lt;p&gt;We can take this further by boosting both the &lt;code&gt;title&lt;/code&gt; and &lt;code&gt;tags&lt;/code&gt; fields. Here, the title gets the highest weight, tags are slightly less important, and content is left at default. In our &lt;code&gt;NoteSearchRepository&lt;/code&gt;, add:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Note&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;searchBoostedTitleAndTags&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="nc"&gt;AggregationOperation&lt;/span&gt; &lt;span class="n"&gt;searchStage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"$search"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
                &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"index"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"default"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"compound"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"should"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
                                &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"query"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"path"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"title"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"score"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"boost"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"value"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;)))),&lt;/span&gt;
                                &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"query"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"path"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"tags"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"score"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"boost"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"value"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;)))),&lt;/span&gt;
                                &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"query"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"path"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"content"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;  
                        &lt;span class="o"&gt;)))&lt;/span&gt;  
        &lt;span class="o"&gt;);&lt;/span&gt;  

        &lt;span class="nc"&gt;Aggregation&lt;/span&gt; &lt;span class="n"&gt;agg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Aggregation&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;newAggregation&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;searchStage&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;mongoTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;aggregate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agg&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"notes"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Note&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;getMappedResults&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And the controller endpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;    &lt;span class="nd"&gt;@GetMapping&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/search/boost-title-tags"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Note&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;searchBoostedTitleAndTags&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nd"&gt;@RequestParam&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;searchRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;searchBoostedTitleAndTags&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Search with autocomplete and fuzzy matching
&lt;/h3&gt;

&lt;p&gt;The boosting lets us prioritise where the words we are searching for appear in our documents, but we're not quite there yet. Searching for exact matches has its limits. Let's bring in autocomplete and fuzzy search to round out our searching experience.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.mongodb.com/docs/atlas/atlas-search/autocomplete/??utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=obsidian+search&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;Autocomplete&lt;/a&gt;, sometimes called type ahead, lets us complete our search queries without the complete input string. We can use the &lt;code&gt;autocomplete&lt;/code&gt; operator if we want our search bar to have a search-as-you-type component to predict words with increasing accuracy as characters are entered in our application's search field.&lt;/p&gt;

&lt;p&gt;Fuzzy search brings in typo tolerance to our search mechanics. This way, even if we misspell our query, say &lt;code&gt;Sprong AI&lt;/code&gt;, the search term will still try to use inexact matches.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Note&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;searchAutocompleteAndFuzzy&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="nc"&gt;AggregationOperation&lt;/span&gt; &lt;span class="n"&gt;searchStage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"$search"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
                &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"index"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"default"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"compound"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"should"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
                                &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"autocomplete"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  
                                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"query"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"path"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"title"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                                &lt;span class="o"&gt;),&lt;/span&gt;  
                                &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  
                                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"query"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"path"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"title"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"fuzzy"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"maxEdits"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;  
                                &lt;span class="o"&gt;)&lt;/span&gt;  
                        &lt;span class="o"&gt;)))&lt;/span&gt;  
        &lt;span class="o"&gt;);&lt;/span&gt;  

        &lt;span class="nc"&gt;AggregationOperation&lt;/span&gt; &lt;span class="n"&gt;limitStage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Aggregation&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;limit&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

        &lt;span class="nc"&gt;Aggregation&lt;/span&gt; &lt;span class="n"&gt;aggregation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Aggregation&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;newAggregation&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;searchStage&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;limitStage&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;mongoTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;aggregate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;aggregation&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"notes"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Note&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;getMappedResults&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And the controller endpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;    &lt;span class="nd"&gt;@GetMapping&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/search/autocomplete-fuzzy"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Note&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;searchAutocompleteAndFuzzy&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nd"&gt;@RequestParam&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;searchRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;searchAutocompleteAndFuzzy&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Search with autocomplete, fuzzy matching, and boosting
&lt;/h3&gt;

&lt;p&gt;So we've brought them in step by step, but what does this all look like together? In our &lt;code&gt;NoteSearchRepository&lt;/code&gt;, we need to add a &lt;code&gt;searchAutocompleteFuzzyBoosted(String query)&lt;/code&gt; like in the code below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Note&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;searchAutocompleteFuzzyBoosted&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;AggregationOperation&lt;/span&gt; &lt;span class="n"&gt;searchStage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"$search"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"index"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"default"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"compound"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"should"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
                                &lt;span class="c1"&gt;// Autocomplete on title (highest boost)  &lt;/span&gt;
                                &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"autocomplete"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"query"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"path"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"title"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"score"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"boost"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"value"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="o"&gt;)))&lt;/span&gt;
                                &lt;span class="o"&gt;),&lt;/span&gt;
                                &lt;span class="c1"&gt;// Fuzzy text search on title&lt;/span&gt;
                                &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  
                                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"query"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"path"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"title"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"fuzzy"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"maxEdits"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
                                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"score"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"boost"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"value"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;)))&lt;/span&gt;
                                &lt;span class="o"&gt;),&lt;/span&gt;  
                                &lt;span class="c1"&gt;// Fuzzy text search on tags  &lt;/span&gt;
                                &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  
                                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"query"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"path"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"tags"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"fuzzy"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"maxEdits"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
                                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"score"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"boost"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"value"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;)))&lt;/span&gt;
                                &lt;span class="o"&gt;),&lt;/span&gt;  
                                &lt;span class="c1"&gt;// Fuzzy text search on content (no boost = default weight)  &lt;/span&gt;
                                &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  
                                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"query"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"path"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"content"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"fuzzy"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"maxEdits"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;  
                                &lt;span class="o"&gt;)&lt;/span&gt;
                        &lt;span class="o"&gt;)))&lt;/span&gt;
        &lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="nc"&gt;AggregationOperation&lt;/span&gt; &lt;span class="n"&gt;sortByScore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Aggregation&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sort&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Sort&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;by&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Sort&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Order&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;desc&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"score"&lt;/span&gt;&lt;span class="o"&gt;)));&lt;/span&gt;
        &lt;span class="nc"&gt;AggregationOperation&lt;/span&gt; &lt;span class="n"&gt;limitStage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Aggregation&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;limit&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="nc"&gt;Aggregation&lt;/span&gt; &lt;span class="n"&gt;aggregation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Aggregation&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;newAggregation&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;searchStage&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sortByScore&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;limitStage&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;mongoTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;aggregate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;aggregation&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"notes"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Note&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;getMappedResults&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And our last bit of code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;    &lt;span class="nd"&gt;@GetMapping&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/search/autocomplete-fuzzy-boosted"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Note&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;searchAutocompleteFuzzyBoosted&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nd"&gt;@RequestParam&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;searchRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;searchAutocompleteFuzzyBoosted&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Testing our search
&lt;/h2&gt;

&lt;p&gt;With all of our endpoints in place, it’s time to see them in action. Start your Spring Boot app:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mvn clean compile
mvn spring-boot:run
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can now test each search feature from your browser or using &lt;code&gt;curl&lt;/code&gt;. For example:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Basic full-text search:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="s2"&gt;"http://localhost:8080/api/notes/search?q=Spring"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Boosted by title:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="s2"&gt;"http://localhost:8080/api/notes/search/boost-title?q=Spring"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Boosted by title and tags:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="s2"&gt;"http://localhost:8080/api/notes/search/boost-title-tags?q=Spring"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Autocomplete and fuzzy search (try misspelling the query):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="s2"&gt;"http://localhost:8080/api/notes/search/autocomplete-fuzzy?q=Sprong"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;All-in-one (autocomplete + fuzzy + boosting):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="s2"&gt;"http://localhost:8080/api/notes/search/autocomplete-fuzzy-boosted?q=Sprong"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you hit these endpoints from your browser (e.g., &lt;code&gt;http://localhost:8080/api/notes/search?q=Spring Boot&lt;/code&gt;), you’ll see the raw JSON response, but this is exactly what a frontend search bar would consume.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;What began as a neglected pile of half-thoughts is now a structured, searchable knowledge base, thanks to MongoDB Atlas Search and Spring Boot. We built a REST API that can handle everything from simple keyword queries to fuzzy, autocomplete-driven searches with custom boosting, making my notes far more discoverable.&lt;/p&gt;

&lt;p&gt;By combining Atlas Search’s operators with Spring Boot’s simplicity, I’ve turned my Obsidian vault into a living wiki I can share online, without the manual hassle of a CMS. If you’ve ever wanted to make your notes or knowledge base truly searchable, this approach is simple to replicate.&lt;/p&gt;

&lt;p&gt;If you found this useful, check out my other tutorials on MongoDB with Spring, like &lt;a href="https://dev.to/mongodb/how-to-build-rag-applications-with-spring-ai-and-mongodb-5gaj"&gt;Spring AI and MongoDB: How to Build RAG Applications&lt;/a&gt; or &lt;a href="https://dev.to/mongodb/building-a-real-time-market-data-aggregator-with-kafka-and-mongodb-27ag"&gt;Building a Real-Time Market Data Aggregator with Kafka and MongoDB&lt;/a&gt;. If you want to learn more about Search with MongoDB, check out this short &lt;a href="https://learn.mongodb.com/learn/course/search-fundamentals-on-demand-devrel-content?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=obsidian+search&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;skills course&lt;/a&gt; which covers everything you need to know.&lt;/p&gt;

</description>
      <category>mongodb</category>
      <category>spring</category>
      <category>java</category>
      <category>search</category>
    </item>
    <item>
      <title>Secure Local RAG with Role-Based Access: Spring AI, Ollama &amp; MongoDB</title>
      <dc:creator>Tim Kelly</dc:creator>
      <pubDate>Tue, 24 Jun 2025 11:30:00 +0000</pubDate>
      <link>https://forem.com/mongodb/secure-local-rag-with-role-based-access-spring-ai-ollama-mongodb-1gj</link>
      <guid>https://forem.com/mongodb/secure-local-rag-with-role-based-access-spring-ai-ollama-mongodb-1gj</guid>
      <description>&lt;p&gt;Retrieval-augmented generation (&lt;a href="https://www.mongodb.com/resources/basics/artificial-intelligence/retrieval-augmented-generation/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=springai+local+rag&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;RAG&lt;/a&gt;) gives our language models the power to fetch more relevant, timely, and informed responses. But here's the catch: What if you're not thrilled about shipping your highly sensitive company information off to some remote server farm in a distant country, just relying on a "promise" it won’t be used to train someone else's LLM?&lt;/p&gt;

&lt;p&gt;Or maybe your organization is a bit more complex. Your legal team shouldn’t have access to the sales team's confidential data, and that overly curious intern definitely shouldn't be poking around executive-level information. In cases like these, you need RAG—but you need it secure, precise, and completely under your control.&lt;/p&gt;

&lt;p&gt;In this tutorial, we’ll build a secure, locally-managed RAG solution that incorporates user-based access control. If you just want the code, check out the &lt;a href="https://github.com/mongodb-developer/securerag" rel="noopener noreferrer"&gt;GitHub repo&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;Before we get started, there are a few things we'll need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MongoDB Atlas account with a cluster set up (M0 free tier is perfect)

&lt;ul&gt;
&lt;li&gt;If you need help getting started, check out our &lt;a href="https://www.mongodb.com/docs/guides/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=springai+local+rag&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;simple guides&lt;/a&gt;.
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;a href="https://www.docker.com/get-started/" rel="noopener noreferrer"&gt;Docker&lt;/a&gt; installed
&lt;/li&gt;

&lt;li&gt;

&lt;a href="https://openjdk.org/" rel="noopener noreferrer"&gt;Java&lt;/a&gt; 17+ (I use 21)
&lt;/li&gt;

&lt;li&gt;

&lt;a href="https://maven.apache.org/download.cgi" rel="noopener noreferrer"&gt;Maven&lt;/a&gt; 3.9.6+&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  What is Ollama?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://ollama.com/" rel="noopener noreferrer"&gt;Ollama&lt;/a&gt; is an open-source tool that allows you to run LLMs locally on your machine. By running models locally, we can maintain full data ownership and avoid the potential security risks associated with cloud storage. Offline AI tools like Ollama are not only faster and more reliable, but also come with no token costs and no usage fees (basically free).&lt;/p&gt;

&lt;h2&gt;
  
  
  Setting up Ollama in Docker
&lt;/h2&gt;

&lt;p&gt;Before we can actually run anything, we need to get Ollama up and running locally. This is what will power both our embedding model and chat model. We’ll keep things fully local so none of our documents ever leave our machine.&lt;/p&gt;

&lt;p&gt;First, create a simple &lt;code&gt;docker-compose.yml&lt;/code&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;version: '3.8'  

services:  
  ollama:  
    image: ollama/ollama  
    container_name: ollama  
    ports:  
      - "11434:11434"  
    volumes:  
      - ollama_data:/root/.ollama  
    restart: unless-stopped  
    environment:  
      - OLLAMA_HOST=0.0.0.0  

volumes:  
  ollama_data:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will spin up Ollama inside Docker and map port &lt;code&gt;11434&lt;/code&gt; to your local machine, which is what Spring AI will connect to later.&lt;/p&gt;

&lt;p&gt;Now, let’s start everything up and pull the models we’ll be using:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With Ollama running, pull down the models:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Pull the chat model&lt;/span&gt;
docker &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; ollama ollama pull llama3.2

&lt;span class="c"&gt;# Pull the embedding model&lt;/span&gt;
docker &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; ollama ollama pull nomic-embed-text

&lt;span class="c"&gt;# Verify models are installed&lt;/span&gt;
docker &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; ollama ollama list
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That’s it. We’re fully set up and ready to start wiring Ollama into Spring AI.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting our app
&lt;/h2&gt;

&lt;p&gt;Before we start actually coding, we need to set up our dependencies and configure our &lt;code&gt;application.properties&lt;/code&gt;. We'll build a simple little API to interact with our application, and while plenty of embedding and language models are out there, we'll stick with &lt;code&gt;nomic-embed-text&lt;/code&gt; for embeddings and &lt;code&gt;llama3.2&lt;/code&gt; as our chat model.&lt;/p&gt;

&lt;h3&gt;
  
  
  Dependencies
&lt;/h3&gt;

&lt;p&gt;To set up our project, the easiest approach is to use &lt;a href="https://start.spring.io/" rel="noopener noreferrer"&gt;Spring Initializr&lt;/a&gt;. We will be selecting Spring Boot version 3.5.0 or later, Maven as our build tool, Java as the language, and setting the Java version to 21 with JAR packaging.&lt;/p&gt;

&lt;p&gt;For dependencies, we'll need Spring Web to handle our REST API, MongoDB Atlas Vector Search for efficient retrieval, and Spring AI Ollama for integration with our embedding and language models.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxuwpx3r0uyeamkwrwzlj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxuwpx3r0uyeamkwrwzlj.png" alt="Initializr set up with dependencies" width="800" height="424"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After all that, we can generate our application, and open it in an IDE of our liking. Double check you have the necessary dependencies. You should see something like this in your &lt;code&gt;pom.xml&lt;/code&gt; once Spring Initializr generates your project:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;dependencies&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.springframework.boot&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;spring-boot-starter-web&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.springframework.ai&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;spring-ai-advisors-vector-store&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.springframework.ai&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;spring-ai-starter-model-ollama&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.springframework.ai&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;spring-ai-starter-vector-store-mongodb-atlas&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;

    &lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.springframework.boot&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;spring-boot-starter-test&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;scope&amp;gt;&lt;/span&gt;test&lt;span class="nt"&gt;&amp;lt;/scope&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/dependencies&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;dependencyManagement&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;dependencies&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.springframework.ai&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;spring-ai-bom&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;version&amp;gt;&lt;/span&gt;${spring-ai.version}&lt;span class="nt"&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;type&amp;gt;&lt;/span&gt;pom&lt;span class="nt"&gt;&amp;lt;/type&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;scope&amp;gt;&lt;/span&gt;import&lt;span class="nt"&gt;&amp;lt;/scope&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;/dependencies&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/dependencyManagement&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  application.properties
&lt;/h3&gt;

&lt;p&gt;Now, let’s configure our &lt;code&gt;application.properties&lt;/code&gt;. This is where we wire everything together—MongoDB, Ollama, and a bit of Spring AI logging so we can peek under the hood while we build.&lt;/p&gt;

&lt;p&gt;First, the basic Spring and MongoDB config:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;spring.application.name=securerag

spring.data.mongodb.uri=${MONGODB_URI}
spring.data.mongodb.database=rag
server.port=8000

logging.level.org.springframework.ai=DEBUG
logging.level.org.springframework.data.mongodb=DEBUG
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, we configure MongoDB Vector Search. This is what allows Spring AI to store and query our document embeddings inside MongoDB Atlas:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;spring.ai.vectorstore.mongodb.collection-name=vector_store
spring.ai.vectorstore.mongodb.initialize-schema=true
spring.ai.vectorstore.mongodb.metadata-fields-to-filter=department,access_level,roles
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, we’re telling Spring AI:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which collection to use for storing our vector embeddings (&lt;code&gt;vector_store&lt;/code&gt;).
&lt;/li&gt;
&lt;li&gt;Which metadata fields we’ll later want to use for filtering search results (&lt;code&gt;department&lt;/code&gt;, &lt;code&gt;access_level&lt;/code&gt;, and &lt;code&gt;roles&lt;/code&gt;).
&lt;/li&gt;
&lt;li&gt;Whether or not to initialize the schema (&lt;code&gt;initialize-schema=true&lt;/code&gt;). This simply means Spring AI will create the collection and index automatically if it doesn’t already exist, so we’re ready to run semantic search right out of the gate.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And finally, we wire up Ollama. This points Spring AI to the local Ollama instance we’ll be running, and tells it which models to use for embedding and chat:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;spring.ai.ollama.base-url=http://localhost:11434
spring.ai.ollama.embedding.model=nomic-embed-text
spring.ai.ollama.chat.options.model=llama3.2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Since different models excel at different tasks, we’re using &lt;code&gt;nomic-embed-text&lt;/code&gt; for generating our embeddings (this helps turn documents into vector representations), and &lt;code&gt;llama3.2&lt;/code&gt; for the actual chat responses. Using separate models here gives us better results on both sides—retrieval and generation.&lt;/p&gt;

&lt;h2&gt;
  
  
  The business logic
&lt;/h2&gt;

&lt;p&gt;Now that we’ve got everything wired up, let’s look at how the actual chat flow works behind the scenes. This is where we combine retrieval, filtering, and our chat model into one secure conversation pipeline.&lt;/p&gt;

&lt;h3&gt;
  
  
  The &lt;code&gt;ChatService&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;We’ll start by building a simple Spring &lt;code&gt;@Service&lt;/code&gt; to contain our main logic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.securerag&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.ai.chat.client.ChatClient&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.ai.chat.client.advisor.vectorstore.QuestionAnswerAdvisor&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.ai.chat.model.ChatModel&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.ai.vectorstore.SearchRequest&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.ai.vectorstore.VectorStore&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@Service&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ChatService&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;ChatModel&lt;/span&gt; &lt;span class="n"&gt;chatModel&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;VectorStore&lt;/span&gt; &lt;span class="n"&gt;vectorStore&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="nc"&gt;ChatService&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ChatModel&lt;/span&gt; &lt;span class="n"&gt;chatModel&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;VectorStore&lt;/span&gt; &lt;span class="n"&gt;vectorStore&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;chatModel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chatModel&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;vectorStore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vectorStore&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// We'll add our business methods here next&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Spring will automatically inject the &lt;code&gt;ChatModel&lt;/code&gt; (which talks to Ollama) and the &lt;code&gt;VectorStore&lt;/code&gt; (which talks to &lt;a href="https://www.mongodb.com/products/platform/atlas-vector-search/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=springai+local+rag&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;MongoDB Atlas Vector Search&lt;/a&gt;). This allows us to easily wire together search and chat in a clean service layer.&lt;/p&gt;

&lt;h3&gt;
  
  
  Sending secure messages
&lt;/h3&gt;

&lt;p&gt;Now, let’s build the core method that handles chat requests, performs semantic search, and applies access control at query time. We’ll add the following to the &lt;code&gt;ChatService&lt;/code&gt; class:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;sendSecureMessage&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;userRole&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="c1"&gt;// Build the base QuestionAnswerAdvisor — this wires up vector search for retrieval&lt;/span&gt;
    &lt;span class="nc"&gt;QuestionAnswerAdvisor&lt;/span&gt; &lt;span class="n"&gt;advisor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;QuestionAnswerAdvisor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vectorStore&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;searchRequest&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;SearchRequest&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  
                    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;topK&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;// Fetch top 5 most relevant documents&lt;/span&gt;
                    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  

    &lt;span class="c1"&gt;// Build the ChatClient using the advisor for retrieval-augmented context&lt;/span&gt;
    &lt;span class="nc"&gt;ChatClient&lt;/span&gt; &lt;span class="n"&gt;filteredChatClient&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChatClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chatModel&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;defaultAdvisors&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;advisor&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  

    &lt;span class="c1"&gt;// Dynamically build the filtering expression based on user role &amp;amp; department&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;filterExpression&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;createAccessFilterExpression&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userRole&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

    &lt;span class="c1"&gt;// Attach the filter expression to this specific chat request&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;filteredChatClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;user&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;advisors&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;param&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;QuestionAnswerAdvisor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;FILTER_EXPRESSION&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;filterExpression&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;call&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A few key things happen here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We build a &lt;code&gt;QuestionAnswerAdvisor&lt;/code&gt; that handles the vector search piece.
&lt;/li&gt;
&lt;li&gt;We dynamically generate our security filter per request, based on who’s asking.
&lt;/li&gt;
&lt;li&gt;Spring AI lets us inject this filter expression into each call to enforce access control at retrieval time.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Building the filter expression
&lt;/h3&gt;

&lt;p&gt;Let’s generate that dynamic filter that decides which documents each user can see:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;createAccessFilterExpression&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;userRole&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;conditions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ArrayList&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;();&lt;/span&gt;  

    &lt;span class="c1"&gt;// Public documents are always accessible&lt;/span&gt;
    &lt;span class="n"&gt;conditions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;add&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"access_level == 'public'"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

    &lt;span class="c1"&gt;// Add role-based conditions if the user has a specific role&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userRole&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;userRole&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;equalsIgnoreCase&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"public"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="nc"&gt;StringBuilder&lt;/span&gt; &lt;span class="n"&gt;roleCondition&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;StringBuilder&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
        &lt;span class="n"&gt;roleCondition&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"(roles in ['"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userRole&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"', 'all_employees'])"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

        &lt;span class="c1"&gt;// Further restrict by department if provided&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;department&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;trim&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;isEmpty&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="n"&gt;roleCondition&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;" &amp;amp;&amp;amp; (department == '"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"')"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="o"&gt;}&lt;/span&gt;  

        &lt;span class="n"&gt;conditions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;add&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;roleCondition&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toString&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="c1"&gt;// Join everything together into one filter string&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;join&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;" || "&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;conditions&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Quick summary:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If we’re public: We only get public docs.
&lt;/li&gt;
&lt;li&gt;If we have a role: We get both public docs &lt;em&gt;and&lt;/em&gt; documents matching our role (and optionally, department).
&lt;/li&gt;
&lt;li&gt;This gives us fine-grained access control entirely at query time, no additional layers needed.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Under the hood, Spring AI takes this filter string and parses it using its built-in &lt;a href="https://docs.spring.io/spring-ai/reference/api/retrieval-augmented-generation.html#_dynamic_filter_expressions" rel="noopener noreferrer"&gt;filter expression language&lt;/a&gt;. This syntax isn’t raw MongoDB query language, but rather a simple boolean expression format supported by Spring AI. Operators like &lt;code&gt;==&lt;/code&gt;, &lt;code&gt;in&lt;/code&gt;, &lt;code&gt;&amp;amp;&amp;amp;&lt;/code&gt;, &lt;code&gt;||&lt;/code&gt;, and parentheses are translated by Spring AI into proper MongoDB &lt;code&gt;$vectorSearch&lt;/code&gt; filters when the query is executed. For example, our filter string:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;access_level&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;'public'&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;roles&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'hr_manager'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'all_employees'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;department&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;'HR'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Would be automatically parsed into the filter expression:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"$or"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"access_level"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"$eq"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"public"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"$and"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"roles"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"$in"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"hr_manager"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"all_employees"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"department"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"$eq"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"HR"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key is that Spring AI expects the filter string to follow this limited expression grammar. You cannot send full MongoDB queries or arbitrary text. As long as your string follows this format, Spring AI handles parsing and conversion automatically.&lt;/p&gt;

&lt;h2&gt;
  
  
  Creating an API
&lt;/h2&gt;

&lt;p&gt;Alright, now that the brains of the system are built, we need a way to actually talk to it. Let’s wire up a super simple REST API so we can hit it with requests and see our secure RAG flow in action.&lt;/p&gt;

&lt;h3&gt;
  
  
  The request payload
&lt;/h3&gt;

&lt;p&gt;We’ll create a plain old Java object to handle chat requests coming into our app. Nothing fancy, just a little DTO that carries the message, role, and department info along for the ride:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.securerag&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ChatRequest&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;userRole&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;ChatRequest&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;  &lt;span class="c1"&gt;// default constructor needed for Spring to deserialize  &lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;ChatRequest&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;userRole&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;userRole&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;userRole&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;department&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getMessage&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setMessage&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getUserRole&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;userRole&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setUserRole&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;userRole&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;userRole&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;userRole&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getDepartment&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setDepartment&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;department&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Basically:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;message&lt;/code&gt;:  what the user is asking
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;userRole&lt;/code&gt;:  who’s asking
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;department&lt;/code&gt;:  which department they're in, if relevant&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Spring will automatically turn incoming JSON into this object for us. Easy.&lt;/p&gt;

&lt;h3&gt;
  
  
  The controller
&lt;/h3&gt;

&lt;p&gt;This is where we expose our endpoints, just a small API surface so we can send chat requests to the back end:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.securerag&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.web.bind.annotation.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@RestController&lt;/span&gt;  
&lt;span class="nd"&gt;@RequestMapping&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/chat"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ChatController&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;ChatService&lt;/span&gt; &lt;span class="n"&gt;chatService&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;ChatController&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ChatService&lt;/span&gt; &lt;span class="n"&gt;chatService&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;chatService&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chatService&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="nd"&gt;@GetMapping&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/{message}"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;sendMessage&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nd"&gt;@PathVariable&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
                              &lt;span class="nd"&gt;@RequestParam&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;defaultValue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"public"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;userRole&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
                              &lt;span class="nd"&gt;@RequestParam&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;required&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;chatService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sendSecureMessage&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;userRole&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="nd"&gt;@PostMapping&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/secure"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;sendSecureMessage&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nd"&gt;@RequestBody&lt;/span&gt; &lt;span class="nc"&gt;ChatRequest&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;chatService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sendSecureMessage&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getMessage&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getUserRole&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getDepartment&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here’s what’s going on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;GET&lt;/code&gt; endpoint makes it easy to test quickly—just drop a URL in your browser or hit it with &lt;code&gt;curl&lt;/code&gt;.
&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;POST&lt;/code&gt; endpoint gives us a bit more flexibility—we can send in the full request body with role and department data.
&lt;/li&gt;
&lt;li&gt;Both routes hand off to our &lt;code&gt;ChatService&lt;/code&gt;, which handles the filtering logic and secure retrieval behind the scenes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Perfect, here’s a little insert you can drop right after you explain the controller:&lt;/p&gt;

&lt;h3&gt;
  
  
  Quick note on GET vs POST
&lt;/h3&gt;

&lt;p&gt;You might be wondering: Why do we have both a &lt;code&gt;GET&lt;/code&gt; and a &lt;code&gt;POST&lt;/code&gt; endpoint for this?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;GET&lt;/code&gt; endpoint is mainly here for convenience. It makes testing easier when you're still building. You can fire off quick curl commands or hit it directly from your browser to sanity check things.
&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;POST&lt;/code&gt; endpoint is better suited for real usage. It allows us to send the full chat request as JSON, which is much cleaner if you're dealing with longer queries, special characters, or calling the API from an actual front end or another system. It also gives us room to grow later if we want to pass in more fields.
&lt;/li&gt;
&lt;li&gt;Technically speaking, since this isn't a simple "read" operation (we're doing retrieval, filtering, embedding comparisons, and inference), many would argue this should be &lt;code&gt;POST&lt;/code&gt;-only once you go into production.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For now, we’ll keep both while we’re building, but if you were turning this into a real product, you’d likely just expose the &lt;code&gt;POST&lt;/code&gt; endpoint.&lt;/p&gt;

&lt;p&gt;And that’s it. The API layer is done. You’ve now got a secure RAG system you can hit with real HTTP requests.&lt;/p&gt;

&lt;h2&gt;
  
  
  Load some sample data
&lt;/h2&gt;

&lt;p&gt;Before we can test anything, we need some documents loaded into our vector store—otherwise, our model will just sit there politely answering nothing.&lt;/p&gt;

&lt;p&gt;We’ll load up some simple HR, Finance, Sales, and Executive documents, each tagged with different access levels, roles, and departments. This will give us enough data to see our filtering in action.&lt;/p&gt;

&lt;p&gt;Here’s our data loader:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.securerag&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.ai.document.Document&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.ai.vectorstore.VectorStore&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.boot.CommandLineRunner&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Component&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="nd"&gt;@Component&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;DocumentLoader&lt;/span&gt; &lt;span class="kd"&gt;implements&lt;/span&gt; &lt;span class="nc"&gt;CommandLineRunner&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;VectorStore&lt;/span&gt; &lt;span class="n"&gt;vectorStore&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;DocumentLoader&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;VectorStore&lt;/span&gt; &lt;span class="n"&gt;vectorStore&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;vectorStore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vectorStore&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="nd"&gt;@Override&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;Exception&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;loadSampleDocuments&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;loadSampleDocuments&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ArrayList&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;();&lt;/span&gt;

        &lt;span class="c1"&gt;// HR Documents&lt;/span&gt;
        &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;add&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                &lt;span class="s"&gt;"Employee Handbook - Vacation Policy: All full-time employees are entitled to 15 days of paid vacation per year."&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                        &lt;span class="s"&gt;"department"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"HR"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                        &lt;span class="s"&gt;"access_level"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"confidential"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                        &lt;span class="s"&gt;"roles"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Arrays&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;asList&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"hr_manager"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"hr_coordinator"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;
                        &lt;span class="s"&gt;"title"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Vacation Policy"&lt;/span&gt;
                &lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;));&lt;/span&gt;

        &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;add&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                &lt;span class="s"&gt;"Salary Review Process: Annual salary reviews are conducted in Q4 of each year."&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                        &lt;span class="s"&gt;"department"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"HR"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                        &lt;span class="s"&gt;"access_level"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"restricted"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                        &lt;span class="s"&gt;"roles"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Arrays&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;asList&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"hr_manager"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"executive"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;
                        &lt;span class="s"&gt;"title"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Salary Review Process"&lt;/span&gt;
                &lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;));&lt;/span&gt;

        &lt;span class="c1"&gt;// Finance Documents&lt;/span&gt;
        &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;add&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                &lt;span class="s"&gt;"Q3 Budget Report: Total revenue for Q3 was $2.4M."&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                        &lt;span class="s"&gt;"department"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Finance"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                        &lt;span class="s"&gt;"access_level"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"confidential"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                        &lt;span class="s"&gt;"roles"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Arrays&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;asList&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"finance_manager"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"cfo"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"executive"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;
                        &lt;span class="s"&gt;"title"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Q3 Budget Report"&lt;/span&gt;
                &lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;));&lt;/span&gt;

        &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;add&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                &lt;span class="s"&gt;"Expense Policy: All business expenses over $100 require manager approval."&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                        &lt;span class="s"&gt;"department"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Finance"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                        &lt;span class="s"&gt;"access_level"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"public"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                        &lt;span class="s"&gt;"roles"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Arrays&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;asList&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"all_employees"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;
                        &lt;span class="s"&gt;"title"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Expense Policy"&lt;/span&gt;
                &lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;));&lt;/span&gt;

        &lt;span class="c1"&gt;// Public Company Information&lt;/span&gt;
        &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;add&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                &lt;span class="s"&gt;"Company Mission Statement: Our mission is to democratize access to artificial intelligence."&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                        &lt;span class="s"&gt;"department"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"General"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                        &lt;span class="s"&gt;"access_level"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"public"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                        &lt;span class="s"&gt;"roles"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Arrays&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;asList&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"all_employees"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"public"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;
                        &lt;span class="s"&gt;"title"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Company Mission Statement"&lt;/span&gt;
                &lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;));&lt;/span&gt;

        &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;add&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                &lt;span class="s"&gt;"Office Hours and Location: Our main office is located at 123 Tech Street, San Francisco, CA."&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                        &lt;span class="s"&gt;"department"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"General"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                        &lt;span class="s"&gt;"access_level"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"public"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                        &lt;span class="s"&gt;"roles"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Arrays&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;asList&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"all_employees"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"public"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;
                        &lt;span class="s"&gt;"title"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Office Information"&lt;/span&gt;
                &lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;));&lt;/span&gt;

        &lt;span class="c1"&gt;// Sales Documents&lt;/span&gt;
        &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;add&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                &lt;span class="s"&gt;"Q4 Sales Targets: Individual sales targets for Q4 are set at $500K per sales representative."&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                        &lt;span class="s"&gt;"department"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Sales"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                        &lt;span class="s"&gt;"access_level"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"confidential"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                        &lt;span class="s"&gt;"roles"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Arrays&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;asList&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"sales_rep"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"sales_manager"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;
                        &lt;span class="s"&gt;"title"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Q4 Sales Targets"&lt;/span&gt;
                &lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;));&lt;/span&gt;

        &lt;span class="c1"&gt;// Executive Documents&lt;/span&gt;
        &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;add&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                &lt;span class="s"&gt;"Strategic Planning 2025: Key initiatives include international expansion and AI model optimization."&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                        &lt;span class="s"&gt;"department"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Executive"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                        &lt;span class="s"&gt;"access_level"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"restricted"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                        &lt;span class="s"&gt;"roles"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Arrays&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;asList&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"ceo"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"cto"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"cfo"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"executive"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;
                        &lt;span class="s"&gt;"title"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Strategic Planning 2025"&lt;/span&gt;
                &lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;));&lt;/span&gt;

        &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Loading "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;size&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;" sample documents..."&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;vectorStore&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;add&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Sample documents loaded successfully!"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We inject the &lt;code&gt;VectorStore&lt;/code&gt; directly into our loader, and on application startup (using &lt;code&gt;CommandLineRunner&lt;/code&gt;), it loads all the documents automatically. Each document contains the main text plus metadata fields: &lt;code&gt;access_level&lt;/code&gt;, &lt;code&gt;roles&lt;/code&gt;, &lt;code&gt;department&lt;/code&gt;, and &lt;code&gt;title&lt;/code&gt;.  This metadata structure maps directly to the filters we wrote earlier in &lt;code&gt;createAccessFilterExpression()&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This metadata is what we’ll later use to apply filtering logic during search. Public documents will always be visible to any user, while the restricted documents will only appear if the user has the appropriate role and department permissions.&lt;/p&gt;

&lt;p&gt;That’s it! Our app will automatically load these into MongoDB Atlas when we start it up.&lt;/p&gt;

&lt;h2&gt;
  
  
  Testing it out
&lt;/h2&gt;

&lt;p&gt;Now, let’s actually give this thing a spin. We can run it with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;MONGODB_URI&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"YOUR_CONNECTION_STRING&amp;gt;"&lt;/span&gt;

mvn spring-boot:run
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With our app running locally, we can hit the API directly using &lt;code&gt;curl&lt;/code&gt; and see our secure retrieval logic in action. First, make sure to give it a moment to insert our sample data and create our embeddings.&lt;/p&gt;

&lt;p&gt;Let’s try a simple public query where no role is provided. Anyone should be able to access this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="s2"&gt;"http://localhost:8000/chat/What%20is%20our%20mission?userRole=public"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And voila!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Based on the provided context, the company's mission statement is:

"Our mission is to democratize access to artificial intelligence by building tools that make AI accessible to businesses of all sizes. We believe in ethical AI development and transparent business practices."

This appears to be a repeated statement, but it provides the core of the company's mission and values.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We get back our mission statement, as expected, pulled from the public data. The model also surfaces a bit of related context (office hours, expense policy, etc.), since it's retrieving semantically similar documents.&lt;/p&gt;

&lt;p&gt;Next, let’s simulate an HR manager querying for vacation policy. This time, we pass both the role and department to unlock access to more restricted content:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="s2"&gt;"http://localhost:8000/chat/vacation%20policy?userRole=hr_manager&amp;amp;department=HR"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And we get back a fairly well informed response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Based on the provided context, I see that there are multiple policies mentioned:

1. Vacation Policy: 15 days of paid vacation per year, with a maximum carryover of 5 unused days to the next year.

2. Expense Policy: All expenses over $100 require manager approval and must be submitted within 30 days of the expense date.

3. Office Hours and Location: The main office is located at 123 Tech Street, San Francisco, CA, with office hours from Monday to Friday, 9 AM to 6 PM Pacific Time.

If you could provide more context or clarify your question, I\'d be happy to try and assist further!     
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, we’re seeing both the vacation policy and other relevant internal documents that the HR manager is allowed to see. The access filter is working exactly as intended.&lt;/p&gt;

&lt;p&gt;Finally, let’s check what happens if a public user asks the same vacation policy question:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="s2"&gt;"http://localhost:8000/chat/vacation%20policy?userRole=public"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As expected, the system correctly returns no private HR data, and limits the response to public information only.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;I don't see any user comments or questions regarding a "vacation policy". The provided text only includes information about an expense policy, office hours and location, and company mission statement. Could you please provide more context or clarify what you would like to know about a vacation policy? I'll do my best to help.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Since this user isn’t authorized for confidential HR documents, the system correctly avoids leaking private data. Instead, the model only references public information and doesn’t return any HR policy details.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;And we’ve built a fully functional, secure, locally-managed RAG system using Spring AI, Ollama, and MongoDB Atlas. The key here wasn’t just getting RAG working (plenty of tutorials stop there), but actually layering access control directly into the retrieval step itself. This way, we’re not handing the language model private data it was never supposed to see in the first place. We also don’t have to pay for token usage, so we aren’t paying for each request.&lt;/p&gt;

&lt;p&gt;With this foundation, you can easily extend things further: add more roles, more metadata, more complex filters, richer documents, or swap in different models depending on your use case. Most importantly, you're in full control. No third-party model provider ever sees your sensitive documents, but you still get the power of retrieval-augmented LLM responses for your users.&lt;/p&gt;

&lt;p&gt;Check out some of my other tutorials, like &lt;a href="https://dev.to/mongodb/building-a-real-time-ai-fraud-detection-system-with-spring-kafka-and-mongodb-2jbn"&gt;Building a Real-Time AI Fraud Detection System with Spring Kafka and MongoDB&lt;/a&gt; or &lt;a href="https://medium.com/mongodb/your-guide-to-optimizing-slow-queries-f69cf413ee71" rel="noopener noreferrer"&gt;Your Guide to Optimizing Slow Queries&lt;/a&gt;, and learn more about what you can do with MongoDB and Java. &lt;/p&gt;

</description>
      <category>mongodb</category>
      <category>rag</category>
      <category>spring</category>
      <category>security</category>
    </item>
    <item>
      <title>Understanding BSON for Java Developers: A Beginner’s Guide to MongoDB’s Data Format</title>
      <dc:creator>Tim Kelly</dc:creator>
      <pubDate>Thu, 12 Jun 2025 08:47:33 +0000</pubDate>
      <link>https://forem.com/mongodb/understanding-bson-for-java-developers-a-beginners-guide-to-mongodbs-data-format-3ddp</link>
      <guid>https://forem.com/mongodb/understanding-bson-for-java-developers-a-beginners-guide-to-mongodbs-data-format-3ddp</guid>
      <description>&lt;p&gt;When working with MongoDB, it’s easy to think you’re dealing with JSON. After all, the queries, documents, and API responses all look like JSON. But MongoDB is not storing JSON. It’s storing BSON—a binary format designed for efficient storage and fast traversal.&lt;/p&gt;

&lt;p&gt;BSON (Binary JSON) is more than just a binary version of JSON. It introduces additional data types like &lt;code&gt;ObjectId&lt;/code&gt;, &lt;code&gt;Decimal128&lt;/code&gt;, and &lt;code&gt;Timestamp&lt;/code&gt;, allowing MongoDB to handle more complex data structures and ensure data integrity. While we might rarely interact with raw BSON directly, understanding how MongoDB stores and processes BSON documents can help us write more efficient queries, handle data conversions properly, and debug unexpected behavior.&lt;/p&gt;

&lt;p&gt;In this guide, we’ll take a look at some of BSON’s key concepts, how it maps to Java types, and how we can work with BSON in the &lt;a href="https://www.mongodb.com/docs/drivers/java/sync/current/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=java+intro+to+bson+syndicated&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;MongoDB Java driver&lt;/a&gt;. We’ll look at creating basic documents, working with nested structures, handling raw BSON, querying with BSON, and finally, mapping Java objects to BSON using POJOs.&lt;/p&gt;

&lt;p&gt;By the end, you’ll have a clear understanding of how MongoDB leverages BSON under the hood and how you can work with it effectively in Java. Let’s get started. If you just want to check out the code, pop over to the &lt;a href="https://github.com/mongodb-developer/mongodbbsonexample" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is BSON?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.mongodb.com/resources/languages/bson#:~:text=What%20Does%20BSON%20Stand%20For,like%20dates%20and%20binary%20data.?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=java+intro+to+bson+syndicated&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;BSON&lt;/a&gt; stands for Binary JSON. It’s a binary-encoded serialization of JSON-like documents. It’s everything we like about JSON, just more efficient and type-rich—optimized for speed and storage in MongoDB.&lt;/p&gt;

&lt;p&gt;While we interact with MongoDB using JSON-like queries, the documents themselves are stored and transmitted as BSON behind the scenes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why not just JSON?
&lt;/h3&gt;

&lt;p&gt;JSON is pretty great. It's human-readable, flexible, and widely adopted. But it’s not ideal for databases. Here’s why MongoDB uses BSON instead:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Speed&lt;/strong&gt;: BSON can be parsed faster than JSON.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rich types&lt;/strong&gt;: BSON supports additional data types like &lt;code&gt;Date&lt;/code&gt;, &lt;code&gt;Decimal128&lt;/code&gt;, and &lt;code&gt;ObjectId&lt;/code&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Traversability&lt;/strong&gt;: BSON includes length prefixes, making it easier for MongoDB to jump between fields during queries.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here’s an example of what a document &lt;code&gt;{“hello”: “world”}&lt;/code&gt; would look like in BSON, with length prefixes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{"hello": "world"} →
\x16\x00\x00\x00           // total document size
\x02                       // 0x02 = type String
hello\x00                  // field name
\x06\x00\x00\x00world\x00  // field value
\x00                       // 0x00 = type EOO ('end of object')
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&amp;gt;Note: A BSON document has a &lt;a href="https://www.mongodb.com/docs/manual/reference/limits/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=java+intro+to+bson+syndicated&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;size limit of 16MB&lt;/a&gt; on MongoDB.&lt;/p&gt;

&lt;h2&gt;
  
  
  BSON vs. JSON
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;JSON&lt;/th&gt;
&lt;th&gt;BSON&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Format&lt;/td&gt;
&lt;td&gt;Text-based&lt;/td&gt;
&lt;td&gt;Binary&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Readability&lt;/td&gt;
&lt;td&gt;Human-readable&lt;/td&gt;
&lt;td&gt;Machine-efficient&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Data types&lt;/td&gt;
&lt;td&gt;Limited (no dates, binary, etc.)&lt;/td&gt;
&lt;td&gt;Rich and explicit (e.g., &lt;code&gt;ObjectId&lt;/code&gt;, &lt;code&gt;Date&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Speed&lt;/td&gt;
&lt;td&gt;Slower to parse&lt;/td&gt;
&lt;td&gt;Faster to parse&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Size&lt;/td&gt;
&lt;td&gt;Often smaller&lt;/td&gt;
&lt;td&gt;Slightly larger due to type metadata&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Common BSON data types (and their Java equivalents)
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;BSON type&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Java equivalent&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;String&lt;/td&gt;
&lt;td&gt;UTF-8 string&lt;/td&gt;
&lt;td&gt;&lt;code&gt;String&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Int32 / Int64&lt;/td&gt;
&lt;td&gt;32-bit / 64-bit integers&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;int&lt;/code&gt;, &lt;code&gt;long&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Double&lt;/td&gt;
&lt;td&gt;64-bit float&lt;/td&gt;
&lt;td&gt;&lt;code&gt;double&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Boolean&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;true&lt;/code&gt; / &lt;code&gt;false&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;&lt;code&gt;boolean&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Date&lt;/td&gt;
&lt;td&gt;Epoch millis&lt;/td&gt;
&lt;td&gt;&lt;code&gt;java.util.Date&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ObjectId&lt;/td&gt;
&lt;td&gt;12-byte unique identifier&lt;/td&gt;
&lt;td&gt;&lt;code&gt;org.bson.types.ObjectId&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Binary&lt;/td&gt;
&lt;td&gt;Byte array&lt;/td&gt;
&lt;td&gt;&lt;code&gt;byte[]&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Document&lt;/td&gt;
&lt;td&gt;Embedded object&lt;/td&gt;
&lt;td&gt;&lt;code&gt;org.bson.Document&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Array&lt;/td&gt;
&lt;td&gt;List of values&lt;/td&gt;
&lt;td&gt;&lt;code&gt;List&amp;lt;?&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  BSON and MongoDB internals
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;BSON is how MongoDB &lt;strong&gt;stores&lt;/strong&gt; documents on disk.
&lt;/li&gt;
&lt;li&gt;BSON is how MongoDB &lt;strong&gt;communicates&lt;/strong&gt; between client and server.
&lt;/li&gt;
&lt;li&gt;Indexes, metadata, and replication all operate on BSON.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Our Java driver handles the BSON encoding and decoding transparently. But if we're building performance-sensitive applications or exploring custom serialization, or we’re even just curious, it's worth understanding.&lt;/p&gt;

&lt;h2&gt;
  
  
  Setup and project structure
&lt;/h2&gt;

&lt;p&gt;In order to follow along with the code, make sure you have Java 24 and Maven installed. You will also need a MongoDB cluster set up. A &lt;a href="https://www.mongodb.com/docs/atlas/tutorial/deploy-free-tier-cluster/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=java+intro+to+bson+syndicated&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;MongoDB M0&lt;/a&gt; free-forever tier is perfect for this.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Project structure:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/src
  └── main
      └── java
          └── com
              └── mongodb
                  ├── Main.java
                  ├── User.java
                  └── Address.java
pom.xml
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Maven dependency (pom.xml):&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;dependencies&amp;gt;
    &amp;lt;dependency&amp;gt;  
        &amp;lt;groupId&amp;gt;org.mongodb&amp;lt;/groupId&amp;gt;  
        &amp;lt;artifactId&amp;gt;mongodb-driver-sync&amp;lt;/artifactId&amp;gt;  
        &amp;lt;version&amp;gt;5.4.0&amp;lt;/version&amp;gt;  
    &amp;lt;/dependency&amp;gt;
&amp;lt;/dependencies&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  BSON data types and document creation
&lt;/h2&gt;

&lt;p&gt;In MongoDB’s Java driver, we can interact directly with BSON types using classes like &lt;code&gt;BsonString&lt;/code&gt;, &lt;code&gt;BsonInt32&lt;/code&gt;, and &lt;code&gt;BsonObjectId&lt;/code&gt;. However, we rarely do this. Instead, we work with standard Java types, and MongoDB automatically handles the conversion to BSON.&lt;/p&gt;

&lt;p&gt;Here’s a look at BSON-specific types:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;BsonString&lt;/span&gt; &lt;span class="n"&gt;bsonString&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BsonString&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Hello BSON"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="nc"&gt;BsonInt32&lt;/span&gt; &lt;span class="n"&gt;bsonInt32&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BsonInt32&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="nc"&gt;BsonInt64&lt;/span&gt; &lt;span class="n"&gt;bsonInt64&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BsonInt64&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;9876543210L&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="nc"&gt;BsonDecimal128&lt;/span&gt; &lt;span class="n"&gt;bsonDecimal&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BsonDecimal128&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Decimal128&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BigDecimal&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"12345.678"&lt;/span&gt;&lt;span class="o"&gt;)));&lt;/span&gt;
&lt;span class="nc"&gt;BsonDateTime&lt;/span&gt; &lt;span class="n"&gt;bsonDate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BsonDateTime&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;getTime&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
&lt;span class="nc"&gt;BsonBinary&lt;/span&gt; &lt;span class="n"&gt;bsonBinary&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BsonBinary&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"binary data"&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getBytes&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
&lt;span class="nc"&gt;BsonObjectId&lt;/span&gt; &lt;span class="n"&gt;bsonObjectId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BsonObjectId&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
&lt;span class="nc"&gt;BsonTimestamp&lt;/span&gt; &lt;span class="n"&gt;bsonTimestamp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BsonTimestamp&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In practice, we work with familiar Java types. The driver converts &lt;code&gt;String&lt;/code&gt;, &lt;code&gt;int&lt;/code&gt;, &lt;code&gt;Date&lt;/code&gt;, and other common types to their BSON equivalents behind the scenes.&lt;/p&gt;

&lt;p&gt;For example, when we call &lt;code&gt;new Date()&lt;/code&gt; in Java, MongoDB stores it as a BSON &lt;code&gt;Date&lt;/code&gt; type, represented as milliseconds since the epoch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;Document&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"created"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;insertOne&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Internally, MongoDB stores it like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{ "created": { "$date": "2025-05-09T12:34:56.789Z" } 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Understanding this conversion helps when we’re debugging data types or working with MongoDB tools that expose BSON representations.&lt;/p&gt;

&lt;p&gt;When interacting with MongoDB, we typically work with the &lt;code&gt;Document&lt;/code&gt; class rather than BSON-specific types. The &lt;code&gt;Document&lt;/code&gt; class represents a BSON document and allows us to structure data using standard Java types.&lt;/p&gt;

&lt;p&gt;Here’s how we create a basic document:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;createBasicDocument&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"\n--- Basic Document Creation ---"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

    &lt;span class="nc"&gt;Document&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"name"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"John Doe"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"age"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"isMember"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"joined"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"_id"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;

    &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;insertOne&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Inserted Document: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toJson&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this method, we create a &lt;code&gt;Document&lt;/code&gt; object with standard Java types: a &lt;code&gt;String&lt;/code&gt;, an &lt;code&gt;int&lt;/code&gt;, a &lt;code&gt;boolean&lt;/code&gt;, a &lt;code&gt;Date&lt;/code&gt;, and an &lt;code&gt;ObjectId&lt;/code&gt;. The MongoDB driver automatically converts these to BSON types when the document is inserted into the collection.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Inserted Document: {"_id": "64e99b0b4321a1b9a6e4d8c7", "name": "John Doe", "age": 30, "isMember": true, "joined": "2025-05-09T12:34:56.789Z"}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;ObjectId&lt;/code&gt; is generated automatically, if not provided. The &lt;code&gt;Date&lt;/code&gt; is converted to a BSON &lt;code&gt;Date&lt;/code&gt; type, stored as milliseconds since the epoch.&lt;/p&gt;

&lt;h3&gt;
  
  
  Nested fields and arrays
&lt;/h3&gt;

&lt;p&gt;MongoDB allows for nested structures and arrays, making it easy to represent complex data within a single document. This structure gives us more flexibility in how we want to model our data than traditional relational databases, where data would typically be spread across multiple tables. In MongoDB, we can embed related data directly within the document, creating hierarchical structures that are easy to query and manipulate.&lt;/p&gt;

&lt;p&gt;In Java, we use the &lt;code&gt;Document&lt;/code&gt; class to create nested structures and arrays. Each nested level is represented by another &lt;code&gt;Document&lt;/code&gt;, and arrays are represented by Java &lt;code&gt;List&lt;/code&gt; objects. When we work with nested documents and arrays in MongoDB, each level of nesting is still BSON, not JSON. This matters because MongoDB uses BSON-specific types like &lt;code&gt;Decimal128&lt;/code&gt; and &lt;code&gt;Date&lt;/code&gt;, even in nested structures:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;createNestedDocument&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"\n--- Nested Fields and Arrays ---"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

    &lt;span class="nc"&gt;Document&lt;/span&gt; &lt;span class="n"&gt;nestedDoc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"user"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Alice"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"balance"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Decimal128&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BigDecimal&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"12345.67"&lt;/span&gt;&lt;span class="o"&gt;)))&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"contacts"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Arrays&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;asList&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"123-456-7890"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"987-654-3210"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"address"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"city"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Dublin"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"postalCode"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"D02"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"tags"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Arrays&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;asList&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"premium"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"verified"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"activity"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"login"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="o"&gt;()).&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"status"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"active"&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;

    &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;insertOne&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nestedDoc&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Inserted Nested Document: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;nestedDoc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toJson&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this method, we are creating a document that represents a user with several fields:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;"user"&lt;/code&gt; is a simple string field.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;"balance"&lt;/code&gt; is a &lt;code&gt;Decimal128&lt;/code&gt; value, which is used for financial calculations to prevent precision loss.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;"contacts"&lt;/code&gt; is a list of strings, representing multiple contact numbers.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;"address"&lt;/code&gt; is a nested document containing &lt;code&gt;"city"&lt;/code&gt; and &lt;code&gt;"postalCode"&lt;/code&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;"tags"&lt;/code&gt; is an array of strings, useful for categorizing or labeling documents.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;"activity"&lt;/code&gt; is another nested document containing a login timestamp and a status field.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When the above method is executed, the document is converted to BSON and inserted into the MongoDB collection. The output of the inserted document will look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Inserted Nested Document: {
  "user": "Alice",
  "balance": 12345.67,
  "contacts": ["123-456-7890", "987-654-3210"],
  "address": {
    "city": "Dublin",
    "postalCode": "D02"
  },
  "tags": ["premium", "verified"],
  "activity": {
    "login": "2025-05-09T12:34:56.789Z",
    "status": "active"
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The nested structure is straightforward to read and query. Each level of nesting is a separate &lt;code&gt;Document&lt;/code&gt; object, and arrays are automatically converted to BSON arrays.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why use nested structures?
&lt;/h3&gt;

&lt;p&gt;Nested documents provide a way to keep related data together, minimizing the number of queries needed to access complete data sets. Instead of joining tables, we can query a single document to retrieve user details, address information, and recent activity. This approach is particularly useful when dealing with hierarchical data, embedded lists, or object relationships that are tightly coupled.&lt;/p&gt;

&lt;h2&gt;
  
  
  Raw BSON manipulation
&lt;/h2&gt;

&lt;p&gt;So far, we’ve relied on the &lt;code&gt;Document&lt;/code&gt; class to handle BSON conversion. But what if we need direct control over BSON structure? That’s where raw BSON manipulation comes in. MongoDB’s Java driver provides a higher-level &lt;code&gt;Document&lt;/code&gt; class that abstracts away BSON specifics, allowing us to work with Java types like &lt;code&gt;String&lt;/code&gt;, &lt;code&gt;int&lt;/code&gt;, and &lt;code&gt;Date&lt;/code&gt;. However, sometimes, we may need to interact directly with BSON data. This can be useful when dealing with binary data, timestamps, or performing low-level optimizations.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Binary data:&lt;/strong&gt; BSON allows for binary storage (&lt;code&gt;BsonBinary&lt;/code&gt;). This is useful for handling images, files, or encrypted data.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Timestamps:&lt;/strong&gt; BSON &lt;code&gt;BsonTimestamp&lt;/code&gt; includes both a time and an increment value, making it useful for tracking operations in oplogs or distributed systems.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;code&gt;BsonDocument&lt;/code&gt; class provides a more granular way to construct BSON documents using BSON-specific types such as &lt;code&gt;BsonString&lt;/code&gt;, &lt;code&gt;BsonInt32&lt;/code&gt;, and &lt;code&gt;BsonBinary&lt;/code&gt;. Unlike the &lt;code&gt;Document&lt;/code&gt; class, &lt;code&gt;BsonDocument&lt;/code&gt; requires explicit type declarations for each field, making it more verbose but also more explicit.&lt;/p&gt;

&lt;p&gt;The following method constructs a BSON document directly using BSON-specific classes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;demonstrateRawBSON&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"\n--- Raw BSON Manipulation ---"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

    &lt;span class="nc"&gt;BsonDocument&lt;/span&gt; &lt;span class="n"&gt;rawBson&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BsonDocument&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"title"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BsonString&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Raw BSON Example"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"value"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BsonInt32&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"binaryData"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BsonBinary&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"raw data"&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getBytes&lt;/span&gt;&lt;span class="o"&gt;()))&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"timestamp"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BsonTimestamp&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"array"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BsonArray&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Arrays&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;asList&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;BsonInt32&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt; 
                    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;BsonInt32&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt; 
                    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;BsonInt32&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;)));&lt;/span&gt;

    &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Raw BSON: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;rawBson&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toJson&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this example, we create a BSON document using explicit BSON classes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;"title"&lt;/code&gt; is a &lt;code&gt;BsonString&lt;/code&gt;, representing a UTF-8 string.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;"value"&lt;/code&gt; is a &lt;code&gt;BsonInt32&lt;/code&gt;, a 32-bit integer.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;"binaryData"&lt;/code&gt; is a &lt;code&gt;BsonBinary&lt;/code&gt;, representing raw byte data as a base64-encoded string.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;"timestamp"&lt;/code&gt; is a &lt;code&gt;BsonTimestamp&lt;/code&gt;, containing both a Unix timestamp and an increment counter.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;"array"&lt;/code&gt; is a &lt;code&gt;BsonArray&lt;/code&gt;, holding multiple &lt;code&gt;BsonInt32&lt;/code&gt; values.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each &lt;code&gt;append()&lt;/code&gt; call explicitly defines the BSON type, making this method more verbose than using the &lt;code&gt;Document&lt;/code&gt; class but also more precise.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Raw BSON: {
  "title": "Raw BSON Example",
  "value": 100,
  "binaryData": {
    "$binary": "cmF3IGRhdGE=",
    "$type": "00"
  },
  "timestamp": {
    "$timestamp": {
      "t": 1650034567,
      "i": 1
    }
  },
  "array": [1, 2, 3]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;binaryData&lt;/code&gt; field is represented as a base64-encoded string with a type identifier (&lt;code&gt;00&lt;/code&gt; for generic binary data). The &lt;code&gt;timestamp&lt;/code&gt; field includes both a time value (&lt;code&gt;t&lt;/code&gt;) and an increment (&lt;code&gt;i&lt;/code&gt;), useful for replication and internal operations.&lt;/p&gt;

&lt;p&gt;Direct BSON manipulation is not typically necessary for most MongoDB operations. For most use cases, the &lt;code&gt;Document&lt;/code&gt; class is sufficient, and a lot more intuitive. The &lt;code&gt;BsonDocument&lt;/code&gt; class is there when we need more precise control over BSON data or when working with advanced MongoDB features like oplog processing or custom serialization.&lt;/p&gt;

&lt;h2&gt;
  
  
  Querying with BSON
&lt;/h2&gt;

&lt;p&gt;When querying MongoDB, we typically use the &lt;a href="https://www.mongodb.com/docs/drivers/java/sync/current/builders/filters/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=java+intro+to+bson+syndicated&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;&lt;code&gt;Filters&lt;/code&gt; class&lt;/a&gt;.The &lt;code&gt;Filters&lt;/code&gt; class provides static factory methods for all the MongoDB query operators. Each method returns an instance of the BSON type, which we can pass to any method that expects a query filter. These filters work with BSON data but allow us to write queries using Java types and let the MongoDB driver handle the conversion to BSON.&lt;/p&gt;

&lt;p&gt;Let’s take a look at a simple equality filter. The following method demonstrates how to query for specific documents using &lt;code&gt;Filters.eq()&lt;/code&gt; and &lt;code&gt;Filters.exists()&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;queryWithBSON&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"\n--- Querying with BSON ---"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// Find the first document where the "user" field is "Alice"&lt;/span&gt;
    &lt;span class="nc"&gt;Document&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;find&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Filters&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;eq&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"user"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Alice"&lt;/span&gt;&lt;span class="o"&gt;)).&lt;/span&gt;&lt;span class="na"&gt;first&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Found Document: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toJson&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// Find all documents that contain the "activity" field&lt;/span&gt;
    &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;find&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Filters&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;exists&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"activity"&lt;/span&gt;&lt;span class="o"&gt;)).&lt;/span&gt;&lt;span class="na"&gt;into&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ArrayList&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;());&lt;/span&gt;
    &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Documents with 'activity' field:"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;forEach&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toJson&lt;/span&gt;&lt;span class="o"&gt;()));&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;Filters.eq()&lt;/code&gt; method creates a simple equality query, matching documents where the &lt;code&gt;user&lt;/code&gt; field is &lt;code&gt;"Alice"&lt;/code&gt;. This query is not searching for a string, but for a BSON &lt;code&gt;BsonString&lt;/code&gt;. Similarly, when querying for dates, BSON expects a &lt;code&gt;Date&lt;/code&gt; type, not a string. This query is similar to the following MongoDB query in the shell:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;db.collection.find({ "user": "Alice" })
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;Filters.exists()&lt;/code&gt; method finds documents that contain a specific field, regardless of its value. In this case, we are searching for all documents that have the &lt;code&gt;"activity"&lt;/code&gt; field. This query is equivalent to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;db.collection.find({ "activity": { $exists: true } })
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If a document with &lt;code&gt;"user": "Alice"&lt;/code&gt; exists, the output will look something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;--- Querying with BSON ---
Found Document: {
  "_id": "64e99b0b4321a1b9a6e4d8c7",
  "user": "Alice",
  "balance": 12345.67,
  "contacts": ["123-456-7890", "987-654-3210"],
  "address": {
    "city": "Dublin",
    "postalCode": "D02"
  },
  "tags": ["premium", "verified"],
  "activity": {
    "login": "2025-05-09T12:34:56.789Z",
    "status": "active"
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If there are multiple documents containing the &lt;code&gt;"activity"&lt;/code&gt; field, each will be printed as a separate JSON object.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;Filters&lt;/code&gt; class provides a range of query operators, allowing us to construct complex queries using methods like &lt;code&gt;eq()&lt;/code&gt;, &lt;code&gt;exists()&lt;/code&gt;, &lt;code&gt;gt()&lt;/code&gt;, &lt;code&gt;lt()&lt;/code&gt;, and &lt;code&gt;in()&lt;/code&gt;. These methods allow us to write type-safe queries without dealing directly with BSON objects.&lt;/p&gt;

&lt;p&gt;This approach keeps the syntax concise and consistent, making the most of the Java types while MongoDB handles the BSON conversion automatically.&lt;/p&gt;

&lt;h2&gt;
  
  
  Aggregation with BSON
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.mongodb.com/docs/manual/aggregation/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=java+intro+to+bson+syndicated&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;MongoDB’s aggregation&lt;/a&gt; framework allows for complex data processing, transforming documents in a collection through a series of stages like filtering, grouping, and projecting. While we usually interact with MongoDB through Java types like &lt;code&gt;String&lt;/code&gt; or &lt;code&gt;int&lt;/code&gt;, the aggregation framework operates directly on BSON data, making it important for us to understand how BSON types are handled in aggregation operations.&lt;/p&gt;

&lt;p&gt;In the Java driver, we construct aggregation pipelines using the &lt;code&gt;Aggregates&lt;/code&gt; class, where each stage is represented as a BSON operation. Each stage in the pipeline processes BSON documents, applying transformations and producing a new BSON structure for the next stage.&lt;/p&gt;

&lt;p&gt;Each stage processes BSON data directly. MongoDB maintains BSON data types throughout the pipeline, preventing data loss and ensuring accuracy in operations like &lt;a href="https://www.mongodb.com/docs/manual/reference/operator/aggregation/sum/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=java+intro+to+bson+syndicated&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;&lt;code&gt;$sum&lt;/code&gt;&lt;/a&gt; and &lt;a href="https://www.mongodb.com/docs/manual/reference/operator/aggregation/avg/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=java+intro+to+bson+syndicated&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;&lt;code&gt;$avg&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;For instance, if we aggregate on a &lt;code&gt;Decimal128&lt;/code&gt; field, the aggregation framework maintains the precision:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Bson&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;pipeline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;     &lt;span class="nc"&gt;Aggregates&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;group&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"$user"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Accumulators&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sum&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"totalBalance"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"$balance"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If &lt;code&gt;balance&lt;/code&gt; is stored as a &lt;code&gt;Decimal128&lt;/code&gt;, the aggregation framework sums it as a &lt;code&gt;Decimal128&lt;/code&gt;. This is crucial for financial calculations where precision matters.&lt;/p&gt;

&lt;p&gt;In this example, we will build an aggregation pipeline that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Filters documents to include only those with a &lt;code&gt;balance&lt;/code&gt; field.
&lt;/li&gt;
&lt;li&gt;Groups documents by the &lt;code&gt;user&lt;/code&gt; field, calculating the sum of all &lt;code&gt;balance&lt;/code&gt; values for each user.
&lt;/li&gt;
&lt;li&gt;Projects the output to include the &lt;code&gt;user&lt;/code&gt; and the calculated &lt;code&gt;totalBalance&lt;/code&gt;.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;aggregateWithBSON&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"\n--- Aggregation with BSON ---"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

    &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Bson&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;pipeline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
        &lt;span class="nc"&gt;Aggregates&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;match&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Filters&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;exists&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"balance"&lt;/span&gt;&lt;span class="o"&gt;)),&lt;/span&gt;
        &lt;span class="nc"&gt;Aggregates&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;group&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"$user"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Accumulators&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sum&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"totalBalance"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"$balance"&lt;/span&gt;&lt;span class="o"&gt;)),&lt;/span&gt;
        &lt;span class="nc"&gt;Aggregates&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;project&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"user"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"$_id"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"totalBalance"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
    &lt;span class="o"&gt;);&lt;/span&gt;

    &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;aggregate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pipeline&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;into&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ArrayList&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;());&lt;/span&gt;
    &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Aggregation Results:"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;forEach&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toJson&lt;/span&gt;&lt;span class="o"&gt;()));&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Match stage:&lt;/strong&gt; The first stage filters documents based on the existence of the &lt;code&gt;balance&lt;/code&gt; field:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;Aggregates&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;match&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Filters&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;exists&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"balance"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This generates a BSON structure similar to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{ "$match": { "balance": { "$exists": true } } }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;Filters.exists()&lt;/code&gt; method constructs a BSON document using the &lt;code&gt;$exists&lt;/code&gt; operator, targeting BSON fields directly. If &lt;code&gt;balance&lt;/code&gt; is a &lt;code&gt;Decimal128&lt;/code&gt; type, it remains a &lt;code&gt;Decimal128&lt;/code&gt; throughout the pipeline, maintaining precision.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Group stage:&lt;/strong&gt; The group stage aggregates documents by a specified field. In BSON, the &lt;code&gt;_id&lt;/code&gt; field represents the grouping key. Here, we use the &lt;code&gt;user&lt;/code&gt; field as the key and calculate the sum of the &lt;code&gt;balance&lt;/code&gt; field:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;Aggregates&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;group&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"$user"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Accumulators&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sum&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"totalBalance"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"$balance"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The resulting BSON structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "$group": {
    "_id": "$user",
    "totalBalance": { "$sum": "$balance" }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this stage, the key (&lt;code&gt;_id&lt;/code&gt;) is defined as the &lt;code&gt;user&lt;/code&gt; field. The &lt;code&gt;balance&lt;/code&gt; field is aggregated using the &lt;code&gt;$sum&lt;/code&gt; accumulator, and the result is stored in a new BSON field called &lt;code&gt;totalBalance&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Project stage:&lt;/strong&gt; In the project stage, we transform the structure of the BSON document, selecting specific fields and renaming them as needed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;Aggregates&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;project&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"user"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"$_id"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"totalBalance"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The resulting BSON structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "$project": {
    "user": "$_id",
    "totalBalance": 1
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This operation renames the &lt;code&gt;_id&lt;/code&gt; field to &lt;code&gt;user&lt;/code&gt; and includes the &lt;code&gt;totalBalance&lt;/code&gt; field in the output. Notice that &lt;code&gt;_id&lt;/code&gt; is no longer a BSON &lt;code&gt;ObjectId&lt;/code&gt; but a value derived from the group key, in this case, a &lt;code&gt;String&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;If the collection contains the following documents:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{ "user": "Alice", "balance": 5000 }
{ "user": "Alice", "balance": 3000 }
{ "user": "Bob", "balance": 7000 }
{ "user": "Alice", "balance": 2500 }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output of the aggregation pipeline will look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Aggregation Results:
{"user": "Alice", "totalBalance": 10500}
{"user": "Bob", "totalBalance": 7000}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each document in the output is a BSON object resulting from the aggregation pipeline. The &lt;code&gt;user&lt;/code&gt; field is derived from the grouping key, and &lt;code&gt;totalBalance&lt;/code&gt; is the calculated sum of all &lt;code&gt;balance&lt;/code&gt; values per user.&lt;/p&gt;

&lt;h2&gt;
  
  
  POJO mapping: Bridging Java and BSON
&lt;/h2&gt;

&lt;p&gt;As we’ve seen, BSON is the native data format for storing documents in MongoDB. While BSON extends JSON with additional data types, Java developers typically don’t need to work directly with raw BSON. Instead, we can work with familiar Java objects and let the MongoDB driver handle the BSON conversion behind the scenes. This is where POJO (Plain Old Java Object) mapping comes into play.&lt;/p&gt;

&lt;p&gt;POJOs are often used for data encapsulation, which is the practice of separating business logic from data representation. If you want a deep understanding of POJO mapping with MongoDB, check out &lt;a href="https://www.mongodb.com/docs/drivers/java/sync/current/data-formats/document-data-format-pojo/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=java+intro+to+bson+syndicated&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;our guide&lt;/a&gt;, but we will go over the basics and get up and running with a simple example.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;PojoCodecProvider&lt;/code&gt; allows MongoDB to automatically map Java objects to BSON documents and back. This not only simplifies data handling but also keeps our data model consistent with our Java classes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why POJO mapping?
&lt;/h3&gt;

&lt;p&gt;Without POJO mapping, we would need to manually convert Java objects into &lt;code&gt;Document&lt;/code&gt; objects and vice versa. This is error-prone and can quickly become cumbersome as our data model grows more complex.&lt;/p&gt;

&lt;p&gt;POJO mapping abstracts away the BSON conversion process. We define our Java classes, and the MongoDB driver handles the rest.&lt;/p&gt;

&lt;h3&gt;
  
  
  Setting up POJO mapping in Java
&lt;/h3&gt;

&lt;p&gt;Before we define our Java classes, we need to configure the &lt;code&gt;PojoCodecProvider&lt;/code&gt;. This codec provider registers our Java classes for automatic BSON mapping.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.bson.codecs.configuration.CodecProvider&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.bson.codecs.configuration.CodecRegistry&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.bson.codecs.pojo.PojoCodecProvider&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;static&lt;/span&gt; &lt;span class="n"&gt;org&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;bson&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;codecs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;configuration&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;CodecRegistries&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fromProviders&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;static&lt;/span&gt; &lt;span class="n"&gt;org&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;bson&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;codecs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;configuration&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;CodecRegistries&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fromRegistries&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;static&lt;/span&gt; &lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;mongodb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;MongoClientSettings&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getDefaultCodecRegistry&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CodecSetup&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="nc"&gt;CodecRegistry&lt;/span&gt; &lt;span class="nf"&gt;getPojoCodecRegistry&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="nc"&gt;CodecProvider&lt;/span&gt; &lt;span class="n"&gt;pojoCodecProvider&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PojoCodecProvider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;automatic&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;fromRegistries&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;getDefaultCodecRegistry&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;fromProviders&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pojoCodecProvider&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The codec registry combines the default &lt;a href="https://www.mongodb.com/docs/drivers/java/sync/current/data-formats/codecs/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=java+intro+to+bson+syndicated&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;BSON codecs&lt;/a&gt; with our custom POJO codecs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Defining a POJO class: User
&lt;/h3&gt;

&lt;p&gt;Now, let’s define a simple &lt;code&gt;User&lt;/code&gt; class that MongoDB will automatically map to BSON.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.bson.types.ObjectId&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.bson.codecs.pojo.annotations.BsonId&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.bson.codecs.pojo.annotations.BsonProperty&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;User&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="nd"&gt;@BsonId&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="nd"&gt;@BsonProperty&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"username"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="nd"&gt;@BsonProperty&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"member"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="n"&gt;isMember&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;User&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="c1"&gt;// No-arg constructor required for POJO mapping  &lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;User&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="n"&gt;isMember&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;age&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isMember&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;isMember&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt; &lt;span class="nf"&gt;getId&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setId&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ObjectId&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getName&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setName&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;getAge&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setAge&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;age&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="nf"&gt;isMember&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;isMember&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setMember&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="n"&gt;isMember&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isMember&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;isMember&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="nd"&gt;@Override&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"User [id="&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;", username="&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;", age="&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;age&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;", member="&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;isMember&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;"]"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;@BsonId&lt;/code&gt;&lt;/strong&gt;: Marks the &lt;code&gt;id&lt;/code&gt; field as the BSON &lt;code&gt;_id&lt;/code&gt; field
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;@BsonProperty&lt;/code&gt;&lt;/strong&gt;: Maps the &lt;code&gt;name&lt;/code&gt; field to the BSON key &lt;code&gt;username&lt;/code&gt;, and &lt;code&gt;isMember&lt;/code&gt; to &lt;code&gt;member&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Inserting and querying POJOs
&lt;/h3&gt;

&lt;p&gt;Now that we have our codec set up and our POJO classes defined, let’s see how we can insert and query these objects.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;
&lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="nc"&gt;MongoCollection&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;User&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;userCollection&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="c1"&gt;// ...&lt;/span&gt;
    &lt;span class="nc"&gt;CodecRegistry&lt;/span&gt; &lt;span class="n"&gt;codecRegistry&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;CodecSetup&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getPojoCodecRegistry&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  

    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MongoClient&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MongoClients&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;create&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;CONNECTION_STRING&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;    

        &lt;span class="c1"&gt;// ...&lt;/span&gt;

        &lt;span class="n"&gt;userCollection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;database&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getCollection&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"users"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;User&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;withCodecRegistry&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;codecRegistry&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="n"&gt;demonstratePojoMapping&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;demonstratePojoMapping&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"\n--- POJO Mapping ---"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

    &lt;span class="nc"&gt;User&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;User&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"John Doe"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="n"&gt;userCollection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;insertOne&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Inserted User: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

    &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;User&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ArrayList&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;();&lt;/span&gt;  
    &lt;span class="n"&gt;userCollection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;find&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;into&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Retrieved Users: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;First, we make sure we register our codec. We then connect to a &lt;code&gt;User&lt;/code&gt; collection with this codec.
&lt;/li&gt;
&lt;li&gt;When the &lt;code&gt;User&lt;/code&gt; object is inserted, MongoDB will automatically convert it to a BSON document.
&lt;/li&gt;
&lt;li&gt;When querying, MongoDB will deserialize the BSON back into a &lt;code&gt;User&lt;/code&gt; object, maintaining type integrity.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  BSON representation of the user document
&lt;/h3&gt;

&lt;p&gt;When the &lt;code&gt;User&lt;/code&gt; object is inserted, MongoDB stores it as a BSON document. The BSON representation will look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
    "_id":{ "$oid":"68211b91221c1ce15490b565" },
    "age":{ "$numberInt":"30" },
    "member":true,
    "username":"John Doe"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;id&lt;/code&gt; field is mapped to the BSON &lt;code&gt;_id&lt;/code&gt; field.
&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;name&lt;/code&gt; field is renamed to &lt;code&gt;username&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Even though we’re working with plain Java objects, MongoDB is still converting these to BSON. This automatic conversion ensures that data types are preserved when stored in MongoDB.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why use POJO mapping?
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Cleaner code:&lt;/strong&gt;
POJO mapping eliminates the need to manually convert Java objects to BSON &lt;code&gt;Document&lt;/code&gt; objects, reducing boilerplate code.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Type safety:&lt;/strong&gt;
MongoDB automatically handles type conversions, ensuring that data is deserialized to the correct Java type (e.g., &lt;code&gt;ObjectId&lt;/code&gt;, &lt;code&gt;Date&lt;/code&gt;).
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Nested structures:&lt;/strong&gt;
Complex nested objects are easily represented using embedded BSON documents, maintaining data structure and hierarchy.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  What about custom conversions?
&lt;/h3&gt;

&lt;p&gt;The default POJO mapping behavior is plenty sufficient for most use cases, but MongoDB also provides options for advanced customization. We can define custom codecs, register additional conventions, or handle abstract types and enums using advanced configuration. For more advanced scenarios, check out our &lt;a href="https://mongodb.github.io/mongo-java-driver/4.11/driver/getting-started/pojos/" rel="noopener noreferrer"&gt;PojoCodecProvider documentation&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;BSON is at the core of how MongoDB stores and transmits data. It extends JSON with additional data types like &lt;code&gt;ObjectId&lt;/code&gt;, &lt;code&gt;Decimal128&lt;/code&gt;, and &lt;code&gt;Timestamp&lt;/code&gt;, allowing MongoDB to handle richer and more complex data structures. While BSON is a binary format optimized for storage and traversal, the Java driver abstracts most of its complexity, allowing developers to work with familiar Java types.&lt;/p&gt;

&lt;p&gt;Throughout this tutorial, we explored how BSON types map to Java types and how we can interact with BSON data using the &lt;code&gt;Document&lt;/code&gt; class and the &lt;code&gt;PojoCodecProvider&lt;/code&gt;. We saw how nested structures, arrays, and raw BSON objects are constructed and manipulated, and how aggregation pipelines process BSON data directly within the database.&lt;/p&gt;

&lt;p&gt;While most operations in MongoDB can be performed using Java objects and the &lt;code&gt;Document&lt;/code&gt; class, understanding BSON is essential for tasks involving binary data, precision calculations with &lt;code&gt;Decimal128&lt;/code&gt;, and operations that require explicit data types like &lt;code&gt;BsonTimestamp&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;In every operation—document creation, querying, aggregation—BSON is working behind the scenes, ensuring that data types are consistent, optimized, and capable of handling complex structures. By understanding how Java types map to BSON, we can write more predictable queries, prevent data type mismatches, and take full advantage of MongoDB’s type system. For a deeper dive, explore the &lt;a href="https://bsonspec.org/" rel="noopener noreferrer"&gt;BSON Specification&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>mongodb</category>
      <category>java</category>
      <category>database</category>
      <category>data</category>
    </item>
    <item>
      <title>Spring AI and MongoDB: How to Build RAG Applications</title>
      <dc:creator>Tim Kelly</dc:creator>
      <pubDate>Tue, 20 May 2025 11:01:09 +0000</pubDate>
      <link>https://forem.com/mongodb/how-to-build-rag-applications-with-spring-ai-and-mongodb-5gaj</link>
      <guid>https://forem.com/mongodb/how-to-build-rag-applications-with-spring-ai-and-mongodb-5gaj</guid>
      <description>&lt;p&gt;&lt;em&gt;Co-authored with Josh Long, &lt;a href="https://spring.io/authors/joshlong" rel="noopener noreferrer"&gt;Spring Developer Advocate&lt;/a&gt; on the Spring team and Youtuber at &lt;a href="https://www.youtube.com/@coffeesoftware" rel="noopener noreferrer"&gt;YouTube.com/@coffeesoftware&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Spring AI 1.0 is now available after a significant development period influenced by rapid advancements in the AI field. Spring AI is a new project from the Spring team that gives you structured, declarative access to large language models, vector search, image generation, and more—directly from your Spring Boot apps. The release includes numerous essential new features for AI engineers.&lt;/p&gt;

&lt;p&gt;Java and Spring are in a prime spot to jump on this whole AI wave. Tons of companies are running their stuff on Spring Boot, which makes it super easy to plug AI into what they're already doing. You can basically link up your business logic and data right to those AI models without too much hassle.&lt;/p&gt;

&lt;p&gt;If you just want the code for this tutorial, check out our &lt;a href="https://github.com/mongodb-developer/springairag" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq9m9bwlel6jjonrdraxm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq9m9bwlel6jjonrdraxm.png" alt="Overview of app architecture" width="800" height="571"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Spring AI?
&lt;/h2&gt;

&lt;p&gt;Spring AI provides support for various AI models and technologies. &lt;strong&gt;Image models&lt;/strong&gt; can generate images given text prompts. &lt;strong&gt;Transcription models&lt;/strong&gt; can take audio and convert them to text. &lt;strong&gt;Embedding models&lt;/strong&gt; convert arbitrary data into vectors, which are data types optimized for semantic similarity search. &lt;strong&gt;Chat models&lt;/strong&gt; should be familiar! You’ve no doubt even had a brief conversation with one somewhere. Chat models are where most of the fanfare seems to be in the AI space. They're awesome. You can get them to help you correct a document or write a poem. (Just don’t ask them to tell a joke… yet.) They’re awesome but they have some issues.  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzv22i2qg7buz1qer9tht.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzv22i2qg7buz1qer9tht.jpg" alt="AI patterns" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;(The picture shown is used with permission from the Spring AI team lead Dr. Mark Pollack)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;AI models are powerful but inherently limited in a few key areas. Spring AI addresses these limitations with targeted solutions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Chat models and system prompts: Chat models are open-ended and prone to distraction. To guide their responses, you provide a system prompt—a structured directive that sets the tone and context for all interactions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Memory management: AI models have no inherent memory. To maintain context across interactions, you need to implement memory mechanisms that correlate past and current messages.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Tool calling: Models operate in isolated environments. However, with tool calling, you can expose specific functions that the model can invoke when necessary, allowing for multi-turn interactions without custom orchestration logic.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Data access and prompt engineering: Models don’t have access to your proprietary databases—nor would you want them to by default. Instead, you can inject context by appending relevant information to the prompt. But there’s a limit to how much data you can include.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Retrieval-augmented generation (RAG): Instead of stuffing the prompt with data, use a vector store like MongoDB Atlas to retrieve only the most relevant information. This targeted approach reduces complexity and response size, enhancing both accuracy and cost-efficiency.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Response validation: AI models can confidently produce incorrect information. Evaluation models can validate or cross-check responses to mitigate these hallucinations.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Integration and workflows with MCP: No AI system functions in isolation. With Model Context Protocol (MCP), you can connect your Spring AI applications to other MCP-enabled services, regardless of language or platform, creating structured, goal-driven workflows.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;You can leverage familiar Spring Boot conventions with Spring AI’s streamlined autoconfigurations, accessible through &lt;a href="https://start.spring.io" rel="noopener noreferrer"&gt;Spring Initializr&lt;/a&gt;. Spring AI provides convenient Spring Boot autoconfigurations that give you the convention-over-configuration setup you’ve come to know and expect. And Spring AI supports observability with Spring Boot’s Actuator and the Micrometer project. It plays well with GraalVM and virtual threads, too, allowing you to build super fast and efficient AI applications that scale. &lt;/p&gt;

&lt;h2&gt;
  
  
  Integrating Spring AI with MongoDB
&lt;/h2&gt;

&lt;p&gt;Whether you're new to MongoDB or already building apps with it, MongoDB offers a modern, flexible foundation for AI-driven development. As a document database, MongoDB stores data in a JSON-like format, making it a natural fit for handling unstructured and semi-structured data—like user input, chat messages, or documents.&lt;/p&gt;

&lt;p&gt;With &lt;a href="https://www.mongodb.com/products/platform/atlas-vector-search/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Spring+AI+MongoDB+RAG&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;Atlas Vector Search&lt;/a&gt;, MongoDB extends this flexibility by letting you store and search vector embeddings alongside your application data—all in the same place. Vector embeddings are just numerical representations of your data. For example, a product description or support ticket might be turned into a 512-dimensional vector—a list of numbers capturing its semantic meaning. This simplifies your architecture: no need for a separate vector database, custom pipelines, or syncing between systems.&lt;/p&gt;

&lt;p&gt;If you're using Java, the &lt;a href="https://www.mongodb.com/docs/drivers/java/sync/current/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Spring+AI+MongoDB+RAG&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;MongoDB Java Sync Driver&lt;/a&gt; and &lt;a href="https://spring.io/projects/spring-ai" rel="noopener noreferrer"&gt;Spring AI&lt;/a&gt; make it easy to integrate semantic search and LLM-based features directly into your application.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is retrieval-augmented generation (RAG)?
&lt;/h3&gt;

&lt;p&gt;Retrieval-augmented generation is a powerful pattern for improving the quality of large language model (LLM) responses. It works by pairing a language model with a document retrieval system—like MongoDB Atlas Vector Search—to ground the model's output in real, up-to-date information.&lt;/p&gt;

&lt;p&gt;Why use RAG? Working with LLMs alone comes with limitations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Outdated knowledge: LLMs are trained on static snapshots of the internet and may not know about recent developments.
&lt;/li&gt;
&lt;li&gt;Lack of context: They can’t access your private or domain-specific data unless you provide it.
&lt;/li&gt;
&lt;li&gt;Hallucinations: Without access to authoritative context, LLMs may generate plausible-sounding but incorrect information.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;RAG addresses these gaps by introducing real-world context into the generation process:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ingest: Convert your internal documents or domain-specific data into vector embeddings and store them in a vector store like MongoDB Atlas.
&lt;/li&gt;
&lt;li&gt;Retrieve: When a user asks a question, use MongoDB Atlas Vector Search to find semantically relevant documents based on the meaning of the query—not just keyword matching.
&lt;/li&gt;
&lt;li&gt;Generate: Feed the retrieved documents to the LLM as context, so it can generate accurate, relevant responses.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This architecture is great for building chatbots, document Q&amp;amp;A systems, or internal assistants that can reason over your organization's knowledge, not just what the model was originally trained on.&lt;/p&gt;

&lt;p&gt;By combining Spring AI’s APIs with MongoDB Atlas Vector Search, you can build intelligent, production-ready applications without leaving the Java ecosystem. Let’s start wiring up your first RAG-powered AI app.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feudawofp4t5c39q9y4mi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feudawofp4t5c39q9y4mi.png" alt="RAG app flow description" width="800" height="457"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here’s a diagram outlining how our app will work. The user sends a question to the chat endpoint. The question is embedded using our AI model, and this embedding is then used to retrieve relevant documents in our database using vector search. These documents, as well as system prompt and optional chat memory, are applied to maintain context. These documents are injected into the prompt, and the LLM generates a context-aware response, which is then returned to the user.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building the application: Meet the dogs
&lt;/h2&gt;

&lt;p&gt;We're building a fictitious dog adoption agency called &lt;em&gt;Pooch Palace&lt;/em&gt;. It's like a shelter where you can find and adopt dogs online! And just like most shelters, people are going to want to talk to somebody, to interview the dogs. We're going to build a service to facilitate that: an assistant. &lt;/p&gt;

&lt;p&gt;Our goal is to build an assistant to help us find the dog of (somebody's!) dream, Prancer, who is described rather hilariously as “a demonic, neurotic, man-hating, animal-hating, children-hating dog that looks like a gremlin.” This dog’s awesome. You might’ve heard about him. He went viral a few years ago when his owner was looking to find a new home for him. The ad was hysterical! Here’s the original &lt;a href="https://www.facebook.com/tyfanee.fortuna/posts/10219752628710467" rel="noopener noreferrer"&gt;post&lt;/a&gt;, in &lt;a href="https://www.buzzfeednews.com/article/amberjamieson/prancer-chihuahua" rel="noopener noreferrer"&gt;Buzzfeed News&lt;/a&gt;, in &lt;a href="https://www.usatoday.com/story/news/nation/2021/04/12/chihuahua-hates-men-and-children-goes-viral-facebook-meet-prancer/7186543002/" rel="noopener noreferrer"&gt;USA Today&lt;/a&gt;, and in the &lt;em&gt;&lt;a href="https://www.nytimes.com/2021/04/28/us/prancer-chihuahua-adopted-connecticut.html" rel="noopener noreferrer"&gt;New York Times&lt;/a&gt;&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;We’re going to build a simple HTTP endpoint that will use the Spring AI integration with an LLM. In this case, we’ll use Open AI, though you can use anything you’d like, including Ollama, Amazon Bedrock, Google Gemini, HuggingFace, and scores of others. It’s all supported through Spring AI to ask the AI model to help us find the dog of our dreams by analysing our question and deciding—after looking at the dogs in the shelter (and in our database)—which might be the best match for us. &lt;/p&gt;

&lt;h3&gt;
  
  
  Getting started: Project setup
&lt;/h3&gt;

&lt;p&gt;To follow along, you’ll need Java version 24 and Maven on your machine ready to run. You’ll also need a &lt;a href="https://www.mongodb.com/docs/atlas/getting-started/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Spring+AI+MongoDB+RAG&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;MongoDB Atlas cluster set up&lt;/a&gt; (the M0 free tier will work), and an OpenAI account with an &lt;a href="https://platform.openai.com/api-keys" rel="noopener noreferrer"&gt;API key&lt;/a&gt; ready to go. &lt;/p&gt;

&lt;p&gt;Go to &lt;a href="https://start.spring.io" rel="noopener noreferrer"&gt;Spring Initializr&lt;/a&gt;, create a new project with the artifact ID &lt;code&gt;mongodb&lt;/code&gt;, and add the following dependencies:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Web
&lt;/li&gt;
&lt;li&gt;Actuator
&lt;/li&gt;
&lt;li&gt;GraalVM
&lt;/li&gt;
&lt;li&gt;Devtools
&lt;/li&gt;
&lt;li&gt;OpenAI
&lt;/li&gt;
&lt;li&gt;MongoDB Atlas Vector Database&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We’re using Java 24 and Apache Maven, but you can adjust as needed. Download the .zip file and open it in your preferred IDE. We need to put the following properties into our &lt;code&gt;application.properties&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;spring.application.name=mongodb

# ai
spring.ai.openai.api-key=&amp;lt;YOUR_OPENAI_CREDENTIAL_HERE&amp;gt;

# mongodb 
spring.data.mongodb.uri=&amp;lt;YOUR_MONGODB_ATLAS_CONNECTION_STRING_HERE&amp;gt;
spring.data.mongodb.database=rag

# mongodb atlas vector store
spring.ai.vectorstore.mongodb.collection-name=vector_store
spring.ai.vectorstore.mongodb.initialize-schema=true

# scale
spring.threads.virtual.enabled=true

# health
management.endpoints.web.exposure.include=*
management.endpoint.health.show-details=always
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let’s walk through some of the configuration values. The &lt;code&gt;ai&lt;/code&gt; section  defines an OpenAI credential. You can and should externalize this as an environment variable, like this: &lt;code&gt;SPRING_AI_OPENAI_API_KEY&lt;/code&gt;. This way, you never risk accidentally checking in the credentials to your GitHub repository. &lt;/p&gt;

&lt;p&gt;In the &lt;code&gt;mongodb&lt;/code&gt; section, we've got the usual bread-and-butter connectivity details. Again, you’ll get this from your MongoDB Atlas account and you’d do well to externalize it into an environment variable. &lt;/p&gt;

&lt;p&gt;Then, in the &lt;code&gt;mongodb atlas vector store&lt;/code&gt; section, we tell MongoDB Atlas Vector Search to initialize a collection called &lt;code&gt;vector_store&lt;/code&gt; for us. Setting &lt;code&gt;initialize-schema&lt;/code&gt; to true tells Spring AI to create the &lt;a href="https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-search-type/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Spring+AI+MongoDB+RAG&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;vector search index&lt;/a&gt; on our collection.&lt;/p&gt;

&lt;p&gt;In the &lt;code&gt;scale&lt;/code&gt; section, we tell Spring Boot to configure Java’s vaunted virtual threads. We’ll revisit this later.&lt;/p&gt;

&lt;p&gt;In the &lt;code&gt;health&lt;/code&gt; section, we tell Spring Boot’s Actuator to expose some details around the state of the application. We’ll revisit this later. &lt;/p&gt;

&lt;h3&gt;
  
  
  Initializing the vector store
&lt;/h3&gt;

&lt;p&gt;Now, let’s get some code going. In the main class, called &lt;code&gt;MongodbApplication.java&lt;/code&gt; in our example, we'll initialize the collection by reading some data from a &lt;code&gt;.json&lt;/code&gt; file into the MongDB Atlas Vector Store. This &lt;code&gt;.json&lt;/code&gt; file is available in the &lt;a href="https://github.com/mongodb-developer/springairag" rel="noopener noreferrer"&gt;GitHub repo&lt;/a&gt;. We need to download it and add it to the resources folder in our application with the name &lt;code&gt;dogs.json&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Now, we’ll open our &lt;code&gt;MongodbApplication&lt;/code&gt; class, and add the necessary hints to read in our new resource.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@ImportRuntimeHints&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MongoApplication&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Hints&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="nd"&gt;@SpringBootApplication&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MongodbApplication&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;SpringApplication&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MongoApplication&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;Resource&lt;/span&gt; &lt;span class="no"&gt;DOGS_JSON_FILE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ClassPathResource&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/dogs.json"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

    &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Hints&lt;/span&gt; &lt;span class="kd"&gt;implements&lt;/span&gt; &lt;span class="nc"&gt;RuntimeHintsRegistrar&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

        &lt;span class="nd"&gt;@Override&lt;/span&gt;
        &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;registerHints&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;RuntimeHints&lt;/span&gt; &lt;span class="n"&gt;hints&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;ClassLoader&lt;/span&gt; &lt;span class="n"&gt;classLoader&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;hints&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;registerResource&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;DOGS_JSON_FILE&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// TBD&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And now, we need to add this data to our vector store. Add the following code to the &lt;code&gt;MongodbApplication&lt;/code&gt; class, and we’ll take a look at what it does:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Bean&lt;/span&gt;
&lt;span class="nc"&gt;ApplicationRunner&lt;/span&gt; &lt;span class="nf"&gt;mongoDbInitialzier&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MongoTemplate&lt;/span&gt; &lt;span class="n"&gt;template&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                                         &lt;span class="nc"&gt;VectorStore&lt;/span&gt; &lt;span class="n"&gt;vectorStore&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                                         &lt;span class="nd"&gt;@Value&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"${spring.ai.vectorstore.mongodb.collection-name}"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;collectionName&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                                         &lt;span class="nc"&gt;ObjectMapper&lt;/span&gt; &lt;span class="n"&gt;objectMapper&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;template&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;collectionExists&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collectionName&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;template&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;estimatedCount&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collectionName&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

            &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;documentData&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;DOGS_JSON_FILE&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getContentAsString&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Charset&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;defaultCharset&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
            &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;jsonNode&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;objectMapper&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;readTree&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;documentData&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="n"&gt;jsonNode&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;spliterator&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;forEachRemaining&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;jsonNode1&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;jsonNode1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"id"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;intValue&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
                &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;jsonNode1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"name"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;textValue&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
                &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;jsonNode1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"description"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;textValue&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
                &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;dogument&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"id: %s, name: %s, description: %s"&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;formatted&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                        &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;
                &lt;span class="o"&gt;));&lt;/span&gt;
                &lt;span class="n"&gt;vectorStore&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;add&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dogument&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
            &lt;span class="o"&gt;});&lt;/span&gt;
        &lt;span class="o"&gt;};&lt;/span&gt;
 &lt;span class="o"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This runner ingests data from the &lt;code&gt;dogs.json&lt;/code&gt; file on the classpath and writes it to MongoDB Atlas using the Spring AI &lt;code&gt;VectorStore&lt;/code&gt; interface. Each record is written as a simple string—&lt;code&gt;id: %s, name: %s, description: %s&lt;/code&gt;. The format doesn’t matter as long as it’s consistent. This is what we’ll be using for generating our embedding, so make sure it includes all the data you want to reference. &lt;/p&gt;

&lt;p&gt;The &lt;code&gt;VectorStore&lt;/code&gt; handles both embedding generation and storage in one step. It uses an OpenAI-backed &lt;code&gt;EmbeddingModel&lt;/code&gt; behind the scenes to convert each string into a vector, then stores that vector in MongoDB Atlas.&lt;/p&gt;

&lt;p&gt;If the target collection doesn’t already exist, Spring AI will create it and automatically define a vector index on the &lt;code&gt;embedding&lt;/code&gt; field. This means the collection and its index are ready for similarity search immediately after ingestion, with no manual setup required.&lt;/p&gt;

&lt;h3&gt;
  
  
  Building the assistant controller
&lt;/h3&gt;

&lt;p&gt;Now, the easy part: building the assistant! We’ll define the &lt;code&gt;AdoptionController&lt;/code&gt;. This is a simple Spring MVC controller that responds to requests and interacts with the AI model to recommend dogs based on user input.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;
&lt;span class="nd"&gt;@Controller&lt;/span&gt;
&lt;span class="nd"&gt;@ResponseBody&lt;/span&gt;
&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AdoptionController&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;ChatClient&lt;/span&gt; &lt;span class="n"&gt;ai&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;InMemoryChatMemoryRepository&lt;/span&gt; &lt;span class="n"&gt;memoryRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="nc"&gt;AdoptionController&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ChatClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Builder&lt;/span&gt; &lt;span class="n"&gt;ai&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;VectorStore&lt;/span&gt; &lt;span class="n"&gt;vectorStore&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;system&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""
                You are an AI-powered assistant to help people adopt a dog from the adoption
                agency named Pooch Palace with locations in Antwerp, Seoul, Tokyo, Singapore, Paris,
                Mumbai, New Delhi, Barcelona, San Francisco, and London. Information about the dogs available
                will be presented below. If there is no information, then return a polite response suggesting we
                don't have any dogs available.
                """&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;memoryRepository&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;InMemoryChatMemoryRepository&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ai&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ai&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;defaultSystem&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;system&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;defaultAdvisors&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;QuestionAnswerAdvisor&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vectorStore&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;/// TBD&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So far, all we’ve defined is a constructor into which we’ve injected the Spring AI &lt;code&gt;ChatClient.Builder&lt;/code&gt;. The &lt;code&gt;ChatClient&lt;/code&gt; is the center piece API for interacting with the auto-configured &lt;code&gt;ChatModel&lt;/code&gt; representing our connection to OpenAI (or Ollama or whatever). &lt;/p&gt;

&lt;p&gt;We want all interactions with the chat model to have some sensible defaults, including a system prompt. The system prompt gives the model a framing through which it should attempt to answer all subsequent questions. So, in this case, we’ve instructed it to answer all questions as though it were working at our fictitious adoption agency called &lt;em&gt;Pooch Palace&lt;/em&gt;. As a result, it should politely decline or deflect any questions that don’t relate to our dog adoption agency. &lt;/p&gt;

&lt;p&gt;In addition, we use a &lt;code&gt;QuestionAnswerAdvisor&lt;/code&gt; backed by a MongoDB &lt;code&gt;VectorStore&lt;/code&gt;. This ensures that the model attempts to answer questions based only on relevant documents—specifically, the vector-matched entries from our database of adoptable dogs. &lt;/p&gt;

&lt;p&gt;We’ve also configured an advisor. Advisors are a Spring AI concept. Think of them as filters, or interceptors. They pre- and post-process requests made to the large language model. Here, the &lt;code&gt;QuestionAnswerAdvisor&lt;/code&gt; lives in a separate dependency we need to explicitly add to our &lt;code&gt;pom.xml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;dependency&amp;gt;
   &amp;lt;groupId&amp;gt;org.springframework.ai&amp;lt;/groupId&amp;gt;
   &amp;lt;artifactId&amp;gt;spring-ai-advisors-vector-store&amp;lt;/artifactId&amp;gt;
&amp;lt;/dependency&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This advisor will search the MongoDB Atlas vector store on each request and do a similarity search, looking for data that might be germane to the user’s request. When somebody comes along inquiring about dogs of a particular characteristic, it will look for dogs that might match the characteristic. The results we get back will be a subselection of all the dogs available for adoption. We’ll transmit only that relevant subselection to the model. Remember, each interaction with a model has costs—both dollars and cents (or Euros!) and complexity costs. So the less data we send, the better. The model, then, takes our system prompt, the user’s question, and the included data from the vector store into consideration when formulating its response. &lt;/p&gt;

&lt;p&gt;Models don't remember things. Think of them like Dory the goldfish from &lt;em&gt;Finding Nemo&lt;/em&gt;. So we configure one more advisor which keeps a running transcript of the conversation between you and it and then reminds the model of what’s been said. This way, it'll remember you from one request to another. &lt;/p&gt;

&lt;p&gt;Now, we’ll add an endpoint to our &lt;code&gt;AdoptionController&lt;/code&gt; to communicate with our AI assistant, with a little bit of memory. This will allow our assistant to maintain a local chat log (10 messages back), to keep track of the conversation, and provide a bit more context to our LLM.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;   &lt;span class="nd"&gt;@GetMapping&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/{user}/dogs/assistant"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;inquire&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nd"&gt;@PathVariable&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nd"&gt;@RequestParam&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;memory&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MessageWindowChatMemory&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;chatMemoryRepository&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;memoryRepository&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;maxMessages&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;memoryAdvisor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MessageChatMemoryAdvisor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;memory&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ai&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;user&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;advisors&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;
                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;advisors&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;memoryAdvisor&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;param&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ChatMemory&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;CONVERSATION_ID&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;call&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With all things taken together, you can now launch the program and then try it out:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;http :8080/tkelly/dogs/assistant question==”do you have any neurotic dogs?”
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With everything in place, it should return that there’s this dog named Prancer we might be interested in. (Boy, would we!)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Yes, we have a neurotic dog available for adoption. Meet Prancer, a demonic, neurotic dog who is not fond of humans, animals, or children, and resembles a gremlin. If you're interested in Prancer or would like more information, feel free to ask!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And there you have it. In no time at all, we went from zero to AI hero, making it easier to connect people with the dogs of their dreams… or nightmares! &lt;/p&gt;

&lt;h2&gt;
  
  
  Bring AI to production
&lt;/h2&gt;

&lt;p&gt;Spring AI does more than just simple pet projects. It is designed for enterprise applications, and to fit in with your existing Spring applications. Transitioning to production involves ensuring security, scalability, and observability. Let’s break it down:&lt;/p&gt;

&lt;h3&gt;
  
  
  Security
&lt;/h3&gt;

&lt;p&gt;It’s trivial to use Spring Security to lock down this web application. You could use the authenticated &lt;a href="https://docs.spring.io/spring-session/reference/guides/boot-findbyusername.html" rel="noopener noreferrer"&gt;&lt;code&gt;Principal#getName&lt;/code&gt;&lt;/a&gt; to use as the conversation ID too. What about the data stored in the collections? You can use MongoDB’s robust &lt;a href="https://www.mongodb.com/products/capabilities/security/encryption/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Spring+AI+MongoDB+RAG&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;data encryption at rest&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scalability
&lt;/h3&gt;

&lt;p&gt;Each time you make an HTTP request to an LLM (or many databases), you’re blocking IO. This is a waste of a perfectly good thread. Threads aren't meant to just sit idle, waiting. Java 21 gives us virtual threads, which—for sufficiently IO-bound services—can dramatically improve scalability. &lt;/p&gt;

&lt;p&gt;Enable them in &lt;code&gt;application.properties&lt;/code&gt; with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;spring.threads.virtual.enabled=true
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Instead of waiting for IO to complete, virtual threads free up resources, allowing your application to scale more effectively under load.&lt;/p&gt;

&lt;h3&gt;
  
  
  GraalVM native images
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.graalvm.org/" rel="noopener noreferrer"&gt;GraalVM&lt;/a&gt; compiles Java applications to native executables, reducing startup time and memory usage. Here’s how to compile your Spring AI app.&lt;/p&gt;

&lt;p&gt;If you’ve got that set up as your SDK, you can turn this Spring AI application into an operating system and architecture specific native image with ease:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;./mvnw -DskipTests -Pnative native:compile
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This takes a minute or so on most machines but once it is done, you can run the binary with ease.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;./target/mongodb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This program will start up in a fraction of the time that it did on the JVM. You might want to comment on the &lt;code&gt;ApplicationRunner&lt;/code&gt; we created earlier since it's going to do IO on startup time, significantly delaying that startup. On my machine, it starts up in less than a tenth of a second. &lt;/p&gt;

&lt;p&gt;Even better, you should observe that the application takes a very small fraction of the RAM it would've otherwise taken on the JVM. &lt;/p&gt;

&lt;h3&gt;
  
  
  Containerize for Docker
&lt;/h3&gt;

&lt;p&gt;“That’s all very well and good,” you might say, “but I need to get this running on any cloud platform, and that means getting it into the shape of a Docker image.” Easy!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;./mvnw -DskipTests -Pnative spring-boot:build-image
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Stand back. This might take another minute still. When it finishes, you’ll see it’s printed out the name of the Docker image that’s been generated. &lt;/p&gt;

&lt;p&gt;You can run it, remembering to override the hosts and ports of things it would've referenced on your host. &lt;/p&gt;

&lt;p&gt;We’re on macOS, and amazingly, this application, when run on a macOS virtual machine emulating Linux, runs &lt;em&gt;even faster&lt;/em&gt; (and right from the jump, too) then it would’ve on macOS directly. Amazing!&lt;/p&gt;

&lt;h3&gt;
  
  
  Observability
&lt;/h3&gt;

&lt;p&gt;This application’s so darn good I’ll bet it’ll make headlines, just like our dog Prancer. And when that happens, you’d be well advised to keep an eye on your system resources and, importantly, the token count. All requests to an LLM have a cost—at least one of complexity, if not dollars and cents. &lt;/p&gt;

&lt;p&gt;Spring Boot Actuator, powered by &lt;a href="https://micrometer.io/" rel="noopener noreferrer"&gt;Micrometer&lt;/a&gt;, provides metrics out of the box.&lt;/p&gt;

&lt;p&gt;Access the metrics endpoint to monitor token usage and other key indicators:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;http://localhost:8080/actuator/metrics
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Nice! You can use Micrometer to forward those metrics to your &lt;a href="https://www.mongodb.com/docs/manual/core/timeseries-collections/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Spring+AI+MongoDB+RAG&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;MongoDB time-series&lt;/a&gt; database to get a single pane of glass, a dashboard. &lt;/p&gt;

&lt;h3&gt;
  
  
  MongoDB Atlas: Fully managed and scalable
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.mongodb.com/atlas/database/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Spring+AI+MongoDB+RAG&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;MongoDB Atlas&lt;/a&gt; is the managed cloud service for MongoDB, offering scalability and high availability across multiple cloud providers. With Atlas, MongoDB handles backups, monitoring, upgrades, and patches—letting you focus on building your application, not maintaining infrastructure. If managing databases isn’t core to your business, why not delegate it to the people who built MongoDB?&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;There you have it! You've just built a production-worthy, AI-ready, Spring AI- and MongoDB-powered application in no time at all. We’ve only begun to scratch the surface. Check out Spring AI 1.0 at the &lt;a href="https://start.spring.io" rel="noopener noreferrer"&gt;Spring Initializr&lt;/a&gt; today and learn more about MongoDB Atlas and the &lt;a href="https://docs.spring.io/spring-ai/reference/api/vectordbs/mongodb.html" rel="noopener noreferrer"&gt;vector store support&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you want to see more about what you can do with MongoDB, Spring, and AI, check out our tutorial on &lt;a href="https://dev.to/mongodb/building-a-real-time-ai-fraud-detection-system-with-spring-kafka-and-mongodb-2jbn"&gt;building a real-time AI fraud detection system with Spring Kafka and MongoDB&lt;/a&gt;. If you’re more interested in just getting started with Spring and MongoDB, we have a &lt;a href="https://www.mongodb.com/developer/products/mongodb/springdata-getting-started-with-java-mongodb/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Spring+AI+MongoDB+RAG&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;tutorial on that too&lt;/a&gt;!&lt;/p&gt;

</description>
      <category>mongodb</category>
      <category>spring</category>
      <category>ai</category>
      <category>java</category>
    </item>
    <item>
      <title>Building a Real-Time Market Data Aggregator with Kafka and MongoDB</title>
      <dc:creator>Tim Kelly</dc:creator>
      <pubDate>Thu, 15 May 2025 12:26:42 +0000</pubDate>
      <link>https://forem.com/mongodb/building-a-real-time-market-data-aggregator-with-kafka-and-mongodb-27ag</link>
      <guid>https://forem.com/mongodb/building-a-real-time-market-data-aggregator-with-kafka-and-mongodb-27ag</guid>
      <description>&lt;p&gt;It starts the same way every time: an RSS feed, a CSV dump, a dashboard wired together with hope and duct tape. Maybe the data’s late. Maybe the moving average lags just enough to miss the signal. Or maybe it’s just another system straining under the weight of “real-time” without the architecture to back it up.&lt;/p&gt;

&lt;p&gt;In this tutorial, we’ll build something better. Using Spring Kafka, we’ll stream simulated live stock data into a Kafka topic. From there, Kafka Streams will compute the Relative Strength Index (RSI) on the fly, and we’ll persist it to MongoDB using Spring Data MongoDB. On the historical side, we’ll use MongoDB’s aggregation framework to calculate and analyze simple and exponential moving averages (SMA and EMA) using the MongoDB aggregation pipeline—no third-party data warehouse required.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvis01uvqvylf3tzf9ouf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvis01uvqvylf3tzf9ouf.png" alt="Main dashboard for our app" width="800" height="506"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;By the end, we’ll have a real-time financial pipeline that’s compact and fast. Feel free to clone the &lt;a href="https://github.com/mongodb-developer/Real-Time-Financial-Aggregator" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;, and use the front end I reference throughout the tutorial.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;Before getting started, make sure you have the following installed and ready:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Java 17+  (I use 21): &lt;a href="https://adoptopenjdk.net/" rel="noopener noreferrer"&gt;Download Java&lt;/a&gt; or use your preferred distribution
&lt;/li&gt;
&lt;li&gt;Maven 3.9+: &lt;a href="https://maven.apache.org/download.cgi" rel="noopener noreferrer"&gt;Download Maven&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;MongoDB Atlas cluster, a &lt;a href="https://www.mongodb.com/cloud/atlas/register/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Financial+Aggregator+Spring+Kafka&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;free tier M0&lt;/a&gt; will work perfectly
&lt;/li&gt;
&lt;li&gt;Apache Kafka 3.9+: (locally installed or accessible remotely) &lt;a href="https://kafka.apache.org/downloads" rel="noopener noreferrer"&gt;Download Kafka&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Creating our Spring application
&lt;/h2&gt;

&lt;p&gt;To keep things simple, we’ll use &lt;a href="https://start.spring.io/" rel="noopener noreferrer"&gt;Spring Initializr&lt;/a&gt; to scaffold our project. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fohewpwatrxygxzz65sd7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fohewpwatrxygxzz65sd7.png" alt="Spring Initializr dependencies" width="800" height="446"&gt;&lt;/a&gt;&lt;br&gt;
Choose the following options:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Project: Maven
&lt;/li&gt;
&lt;li&gt;Language: Java
&lt;/li&gt;
&lt;li&gt;Spring Boot Version: 3.4.4
&lt;/li&gt;
&lt;li&gt;Group: &lt;code&gt;com.mongodb&lt;/code&gt; (our your own preferred group)
&lt;/li&gt;
&lt;li&gt;Artifact: &lt;code&gt;FinancialAggregator&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Name: &lt;code&gt;FinancialAggregator&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Java version: 21
&lt;/li&gt;
&lt;li&gt;Dependencies:

&lt;ul&gt;
&lt;li&gt;Spring Web
&lt;/li&gt;
&lt;li&gt;Spring Data MongoDB
&lt;/li&gt;
&lt;li&gt;Spring for Apache Kafka
&lt;/li&gt;
&lt;li&gt;Kafka Streams&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here’s what each dependency will do for us:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Spring Web: Gives us everything we need to expose REST APIs with Spring Boot. We’ll use this to serve our financial data through HTTP endpoints.
&lt;/li&gt;
&lt;li&gt;Spring Data MongoDB: Provides convenient, idiomatic access to MongoDB using Spring Data repositories. It allows us to interact with our database in a way that feels familiar as a Spring developer.
&lt;/li&gt;
&lt;li&gt;Spring for Apache Kafka: Adds Spring Boot integration for Kafka, including &lt;code&gt;KafkaTemplate&lt;/code&gt; for sending messages and configuration support for producers and consumers.
&lt;/li&gt;
&lt;li&gt;Kafka Streams: The actual stream processing engine. This powers our calculations on our stock data in real time. Spring will wire this into the app using &lt;code&gt;@EnableKafkaStreams&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With this setup, we’ll have a data pipeline that simulates live financial data, processes it on the fly, stores it, and exposes it, as well as allowing us to analyze trends in historic data.&lt;/p&gt;

&lt;p&gt;Once generated, extract the zip and open it in your preferred IDE.&lt;/p&gt;
&lt;h2&gt;
  
  
  Setting up MongoDB
&lt;/h2&gt;

&lt;p&gt;Before we can start processing and storing data, we need a MongoDB cluster ready to receive it.&lt;/p&gt;

&lt;p&gt;If you don’t already have a MongoDB Atlas cluster, you can follow &lt;a href="https://www.mongodb.com/docs/atlas/getting-started/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Financial+Aggregator+Spring+Kafka&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;this quick start guide&lt;/a&gt; to set one up for free. Once our cluster is running, we can use either the Atlas UI or &lt;code&gt;mongosh&lt;/code&gt; to configure the necessary collections.&lt;/p&gt;
&lt;h3&gt;
  
  
  What MongoDB does in this project
&lt;/h3&gt;

&lt;p&gt;MongoDB acts as our primary storage engine. It stores both:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Real-time RSI data produced by Kafka Streams.
&lt;/li&gt;
&lt;li&gt;Historical stock market data used to calculate SMA and EMA via &lt;a href="https://www.mongodb.com/docs/manual/aggregation/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Financial+Aggregator+Spring+Kafka&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;aggregations&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We’ll take advantage of time series collections in MongoDB, which are optimized for timestamped data like stock prices. This lets us efficiently query, sort, and analyze trends over time.&lt;/p&gt;
&lt;h3&gt;
  
  
  Creating time series collections in MongoDB Atlas
&lt;/h3&gt;

&lt;p&gt;Once your cluster is ready, follow these steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Open the MongoDB Atlas UI and navigate to your cluster.
&lt;/li&gt;
&lt;li&gt;Click "Browse Collections."
&lt;/li&gt;
&lt;li&gt;Create a new database named &lt;code&gt;Stock_Market_Data&lt;/code&gt;.
&lt;/li&gt;
&lt;li&gt;Inside this database, create two time series collections:

&lt;ul&gt;
&lt;li&gt;Collection Name: &lt;code&gt;Live_Data&lt;/code&gt;

&lt;ul&gt;
&lt;li&gt;Time Field: &lt;code&gt;date&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Granularity: Seconds
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Collection Name: &lt;code&gt;Historical_Data&lt;/code&gt;

&lt;ul&gt;
&lt;li&gt;Time Field: &lt;code&gt;date&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Metafield: company (we'll see why this is important later)
&lt;/li&gt;
&lt;li&gt;Granularity: Seconds&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That’s all you need to start writing and reading real-time stock data with MongoDB. Next, we’ll simulate live market data and send it to Kafka.&lt;/p&gt;
&lt;h2&gt;
  
  
  Configuring our MongoDB connection
&lt;/h2&gt;

&lt;p&gt;From the Atlas UI, we can &lt;a href="https://www.mongodb.com/docs/guides/atlas/connection-string/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Financial+Aggregator+Spring+Kafka&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;get our connection string&lt;/a&gt;, and bring it back to our application properties.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;spring.application.name=Financial Aggregator  

spring.data.mongodb.uri=&amp;lt;Your-Connection-String&amp;gt;
spring.data.mongodb.database=Stock_Market_Data
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We'll also name the database we're using, &lt;code&gt;Stock_Market_Data&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Creating our live data
&lt;/h2&gt;

&lt;p&gt;Our live data won’t be truly live. Instead, we’ll simulate and stream it into the system. There are plenty of APIs out there that offer real-time stock data, but to keep this demo self-contained and avoid external dependencies and key generation/management, we’ll mock the data ourselves. That said, if you’d prefer to plug in a real API, go for it—this setup won’t get in your way.&lt;/p&gt;

&lt;h3&gt;
  
  
  Our live data model
&lt;/h3&gt;

&lt;p&gt;Start by creating a package called &lt;code&gt;model&lt;/code&gt; in your application. Inside it, add a new class named &lt;code&gt;LiveStockData&lt;/code&gt;. I came to understand far too late in the game that this could be confused with an agricultural term, but alas.&lt;/p&gt;

&lt;p&gt;This class is our POJO—used to map simulated stock data to documents in MongoDB:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.model&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.bson.types.ObjectId&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.annotation.Id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.mongodb.core.mapping.Document&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.mongodb.core.mapping.TimeSeries&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Date&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Live_Data"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
&lt;span class="nd"&gt;@TimeSeries&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timeField&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"date"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;LiveStockData&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="nd"&gt;@Id&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;closeLast&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;low&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;volume&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;high&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;rsi&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;LiveStockData&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt; &lt;span class="nf"&gt;getId&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setId&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ObjectId&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="nf"&gt;getDate&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setDate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getCompany&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setCompany&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;company&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="nf"&gt;getCloseLast&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;closeLast&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setCloseLast&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;closeLast&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;closeLast&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;closeLast&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="nf"&gt;getOpen&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setOpen&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;open&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="nf"&gt;getLow&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;low&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setLow&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;low&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;low&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;low&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="nf"&gt;getVolume&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;volume&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setVolume&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;volume&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;volume&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;volume&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="nf"&gt;getHigh&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;high&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setHigh&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;high&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;high&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;high&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="nf"&gt;getRsi&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;rsi&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setRsi&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;rsi&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;rsi&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rsi&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This class gives us enough fields to mimic the structure of a real stock market API and power a candlestick chart. It also includes a field for the RSI (Relative Strength Index), which we'll calculate later.&lt;/p&gt;

&lt;p&gt;A lot of this is fairly standard getter/setters, but let's take a look at the annotations we're adding to let Spring know what to do with this class.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Live_Data"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This tells Spring Data MongoDB to map this class to the &lt;code&gt;Live_Data&lt;/code&gt; collection in MongoDB.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@TimeSeries&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timeField&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"date"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This annotation enables MongoDB’s time series capabilities. We’re specifying &lt;code&gt;date&lt;/code&gt; as the time field, which allows us to efficiently store and query stock data over time.&lt;/p&gt;

&lt;h3&gt;
  
  
  Configure Kafka producer
&lt;/h3&gt;

&lt;p&gt;Next, we’ll set up a Kafka producer to stream our simulated stock data. Create a new &lt;code&gt;config&lt;/code&gt; package, and inside it, add a class named &lt;code&gt;KafkaProducerConfig&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The Kafka producer is responsible for publishing messages, in our case, &lt;code&gt;LiveStockData&lt;/code&gt; objects, to a Kafka topic.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.config&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.model.LiveStockData&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.apache.kafka.clients.producer.ProducerConfig&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.apache.kafka.common.serialization.StringSerializer&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.context.annotation.Bean&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.context.annotation.Configuration&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.kafka.core.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.kafka.support.serializer.JsonSerializer&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.HashMap&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Map&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@Configuration&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;KafkaProducerConfig&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="nd"&gt;@Bean&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;ProducerFactory&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;LiveStockData&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;producerFactory&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;HashMap&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;();&lt;/span&gt;  
        &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ProducerConfig&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;BOOTSTRAP_SERVERS_CONFIG&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"localhost:9092"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ProducerConfig&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;KEY_SERIALIZER_CLASS_CONFIG&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;StringSerializer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ProducerConfig&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;VALUE_SERIALIZER_CLASS_CONFIG&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;JsonSerializer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;DefaultKafkaProducerFactory&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;(&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="nd"&gt;@Bean&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;KafkaTemplate&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;LiveStockData&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;kafkaTemplate&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;KafkaTemplate&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;(&lt;/span&gt;&lt;span class="n"&gt;producerFactory&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, we configure a &lt;code&gt;ProducerFactory&lt;/code&gt; that knows how to serialize our data using &lt;code&gt;JsonSerializer&lt;/code&gt;. We also state we will be talking to the broker on &lt;code&gt;localhost:9092&lt;/code&gt;. The &lt;code&gt;KafkaTemplate&lt;/code&gt; is a convenient way to send messages to Kafka topics from anywhere in the app.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;@Configuration&lt;/code&gt; annotation just tells Spring this is a configuration class.&lt;/p&gt;

&lt;h3&gt;
  
  
  Configure Kafka streams
&lt;/h3&gt;

&lt;p&gt;Before we can wire up the stream logic, we need to tell Kafka Streams how to serialize and deserialize our custom &lt;code&gt;LiveStockData&lt;/code&gt; objects. To do this, we'll create a custom SerDe (serializer/deserializer).&lt;/p&gt;

&lt;p&gt;Create a &lt;code&gt;utility&lt;/code&gt; package, and inside it, add the &lt;code&gt;LiveStockDataSerde&lt;/code&gt; class:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.utility&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.model.LiveStockData&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.apache.kafka.common.serialization.Serdes&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.kafka.support.serializer.JsonDeserializer&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.kafka.support.serializer.JsonSerializer&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;LiveStockDataSerde&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;Serdes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;WrapperSerde&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;LiveStockData&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;LiveStockDataSerde&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="kd"&gt;super&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;JsonSerializer&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;(),&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;JsonDeserializer&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;(&lt;/span&gt;&lt;span class="nc"&gt;LiveStockData&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We’ll use this class when defining our stream topology to ensure Kafka Streams can work directly with our domain objects without manual conversion—basically, so Kafka Streams knows how to convert bytes to Java objects and back.&lt;/p&gt;

&lt;p&gt;In the same &lt;code&gt;config&lt;/code&gt; package as our &lt;code&gt;KafkaProducerConfig&lt;/code&gt;, we also need to create a &lt;code&gt;KafkaStreamsConfig&lt;/code&gt; class. The &lt;code&gt;KafkaStreamConfig&lt;/code&gt; will configure the Kafka Streams API to consume our messages and process them in real time.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.config&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.apache.kafka.common.serialization.Serdes&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.apache.kafka.streams.StreamsConfig&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.context.annotation.Bean&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.context.annotation.Configuration&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.kafka.annotation.EnableKafkaStreams&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.kafka.annotation.KafkaStreamsDefaultConfiguration&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.kafka.config.KafkaStreamsConfiguration&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.kafka.support.serializer.JsonDeserializer&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.HashMap&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Map&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  


&lt;span class="nd"&gt;@Configuration&lt;/span&gt;  
&lt;span class="nd"&gt;@EnableKafkaStreams&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;KafkaStreamsConfig&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="nd"&gt;@Bean&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;KafkaStreamsDefaultConfiguration&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;DEFAULT_STREAMS_CONFIG_BEAN_NAME&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="nc"&gt;KafkaStreamsConfiguration&lt;/span&gt; &lt;span class="nf"&gt;kStreamsConfig&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;props&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;HashMap&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;();&lt;/span&gt;  
        &lt;span class="n"&gt;props&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;StreamsConfig&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;APPLICATION_ID_CONFIG&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"rsi-streams-app"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="n"&gt;props&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;StreamsConfig&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;BOOTSTRAP_SERVERS_CONFIG&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"localhost:9092"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="n"&gt;props&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;StreamsConfig&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;DEFAULT_KEY_SERDE_CLASS_CONFIG&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Serdes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;String&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;getClass&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;  
        &lt;span class="n"&gt;props&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;JsonDeserializer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;TRUSTED_PACKAGES&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"com.mongodb.financialaggregator.model"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="n"&gt;props&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;JsonDeserializer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;VALUE_DEFAULT_TYPE&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"com.mongodb.financialaggregator.model.LiveStockData"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="n"&gt;props&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;JsonDeserializer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;USE_TYPE_INFO_HEADERS&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;KafkaStreamsConfiguration&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;props&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This sets up the Kafka Streams runtime with the basic configurations it needs: a unique application ID, Kafka broker address, and the expected serialization types. It also ensures that our stream can deserialize &lt;code&gt;LiveStockData&lt;/code&gt; objects correctly.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;@EnableKafkaStreams&lt;/code&gt; tells Spring Boot to activate Kafka Streams support.&lt;/p&gt;

&lt;p&gt;We also have some important configurations set in our &lt;code&gt;@Bean&lt;/code&gt;. &lt;code&gt;KafkaStreamsConfiguration&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;APPLICATION_ID_CONFIG&lt;/code&gt;: A unique name for this Kafka Streams application. This is required. This ID is how Kafka tracks our app’s state and progress across restarts. Changing it will reset our app’s stream state, including any local state stores or offsets.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;BOOTSTRAP_SERVERS_CONFIG&lt;/code&gt;: Where to find our Kafka cluster.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;DEFAULT_KEY_SERDE_CLASS_CONFIG&lt;/code&gt;: Uses &lt;code&gt;Serdes.String()&lt;/code&gt;. We expect keys to be strings.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;JsonDeserializer&lt;/code&gt; config: Ensures Kafka Streams knows how to read our custom object (&lt;code&gt;LiveStockData&lt;/code&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're new to Kafka, you might be wondering, "Why not just use a Kafka Consumer?" Well Kafka Streams isn’t just a consumer—it’s a full stream processing library built on top of Kafka. It lets us define processing logic as a topology of transformations, stateful operations, and sinks. In this case, we’ll use it to calculate RSI values as they come in.&lt;/p&gt;

&lt;h2&gt;
  
  
  Simulating live stock data
&lt;/h2&gt;

&lt;p&gt;Now, I’m not going to pretend this is the most impressive stock market simulator in the world. It’s not. But it gets the job done. It gives us consistent, dynamic data to push into Kafka and eventually graph. That’s all we need.&lt;/p&gt;

&lt;p&gt;I've added some logic here just to generate some fluctuating data for an imaginary company called "Doof."&lt;/p&gt;

&lt;p&gt;We'll create a &lt;code&gt;service&lt;/code&gt; package and add a class named &lt;code&gt;SimulateLiveData&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.model.LiveStockData&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.kafka.core.KafkaTemplate&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.scheduling.annotation.Scheduled&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.math.BigDecimal&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.math.RoundingMode&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.time.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Date&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Random&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="nd"&gt;@Service&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;SimulateLiveData&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;currentSimulatedDate&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;previousClose&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;KafkaTemplate&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;LiveStockData&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;kafkaTemplate&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="no"&gt;COMPANY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"DOOF"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="no"&gt;INITIAL_OPEN&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;100.0&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="no"&gt;INITIAL_VOLUME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;1000.0&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;trendSlope&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.002&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="n"&gt;trendingUp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;trendCounter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;oscillationPeriod&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;oscillationAmplitude&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;1.5&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;Random&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Random&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;SimulateLiveData&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;KafkaTemplate&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;LiveStockData&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;kafkaTemplate&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;kafkaTemplate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;kafkaTemplate&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;currentSimulatedDate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;from&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;LocalDate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;now&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;atStartOfDay&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ZoneId&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;systemDefault&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toInstant&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;previousClose&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;INITIAL_OPEN&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;LiveStockData&lt;/span&gt; &lt;span class="nf"&gt;simulateLiveData&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;LiveStockData&lt;/span&gt; &lt;span class="n"&gt;stock&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;LiveStockData&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

        &lt;span class="n"&gt;stock&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setDate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;currentSimulatedDate&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;stock&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setCompany&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;COMPANY&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;trendAdjustment&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;trendingUp&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="n"&gt;trendSlope&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;trendSlope&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;oscillation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;oscillationAmplitude&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
                &lt;span class="nc"&gt;Math&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sin&lt;/span&gt;&lt;span class="o"&gt;((&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nc"&gt;Math&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;PI&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;trendCounter&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;oscillationPeriod&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;marketBaseline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;previousClose&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;trendAdjustment&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;oscillation&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

        &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="n"&gt;hasVolatility&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;nextDouble&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;volatilityBoost&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hasVolatility&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;randomBetween&lt;/span&gt;&lt;span class="o"&gt;(-&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

        &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;open&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;marketBaseline&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;volatilityBoost&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;close&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;open&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;randomBetween&lt;/span&gt;&lt;span class="o"&gt;(-&lt;/span&gt;&lt;span class="mf"&gt;0.05&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.05&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
        &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;high&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Math&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;max&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;randomBetween&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.01&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
        &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;low&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Math&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;min&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;randomBetween&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.01&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
        &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;volume&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;INITIAL_VOLUME&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;randomBetween&lt;/span&gt;&lt;span class="o"&gt;(-&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;

        &lt;span class="n"&gt;stock&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setOpen&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;round&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
        &lt;span class="n"&gt;stock&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setCloseLast&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;round&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
        &lt;span class="n"&gt;stock&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setHigh&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;round&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;high&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
        &lt;span class="n"&gt;stock&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setLow&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;round&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;low&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
        &lt;span class="n"&gt;stock&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setVolume&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;round&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;volume&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;

        &lt;span class="n"&gt;previousClose&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;trendCounter&lt;/span&gt;&lt;span class="o"&gt;++;&lt;/span&gt;

        &lt;span class="c1"&gt;// Advance to the next simulated day&lt;/span&gt;
        &lt;span class="nc"&gt;LocalDate&lt;/span&gt; &lt;span class="n"&gt;next&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;currentSimulatedDate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toInstant&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;atZone&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ZoneId&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;systemDefault&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toLocalDate&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;plusDays&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;currentSimulatedDate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;from&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;next&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;atStartOfDay&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ZoneId&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;systemDefault&lt;/span&gt;&lt;span class="o"&gt;()).&lt;/span&gt;&lt;span class="na"&gt;toInstant&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;

        &lt;span class="c1"&gt;// Flip the trend every 60 days&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;trendCounter&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;trendingUp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;trendingUp&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;stock&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="nf"&gt;randomBetween&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;min&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;max&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;min&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;min&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;nextDouble&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;BigDecimal&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;valueOf&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setScale&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;RoundingMode&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;HALF_UP&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;doubleValue&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This function simulates somewhat real-looking stock data with some randomness, gentle trends, and a hint of periodic volatility. It resets daily and flips direction every 60 “days” to prevent things from spiraling to the moon or zero.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scheduling the simulation
&lt;/h3&gt;

&lt;p&gt;Now, let’s wire it up so that we publish this fake stock data to Kafka automatically. I've arbitrarily decided once per second.&lt;/p&gt;

&lt;p&gt;Inside the same &lt;code&gt;SimulateLiveData&lt;/code&gt; class, add:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Scheduled&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fixedRate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;simulateEverySecond&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;LiveStockData&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;simulateLiveData&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

    &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"========== STOCK DEBUG INFO =========="&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Company:       "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getCompany&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
    &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Date (raw):    "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getDate&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
    &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Date (class):  "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getDate&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getDate&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;getClass&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;getName&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"null"&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
    &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Open:          "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getOpen&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
    &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Close/Last:    "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getCloseLast&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
    &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"High:          "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getHigh&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
    &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Low:           "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getLow&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
    &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Volume:        "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getVolume&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
    &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"======================================"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;kafkaTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;send&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"stock-prices"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getCompany&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Sent successfully.\n"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Exception&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;err&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Insert failed: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getMessage&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
        &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;printStackTrace&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Thanks to &lt;code&gt;@Scheduled(fixedRate = 1000)&lt;/code&gt;, this method is triggered every second. It generates a new stock datapoint, prints it out for debugging, and sends it to the &lt;code&gt;stock-prices&lt;/code&gt; Kafka topic using the &lt;code&gt;KafkaTemplate&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;I've added some logging and error handling just to help with the clarity when we're running the application.&lt;/p&gt;

&lt;p&gt;Finally, make sure you enable scheduling in your application entrypoint by adding &lt;code&gt;@EnableScheduling&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.boot.SpringApplication&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.boot.autoconfigure.SpringBootApplication&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.scheduling.annotation.EnableScheduling&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="nd"&gt;@SpringBootApplication&lt;/span&gt;
&lt;span class="nd"&gt;@EnableScheduling&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;FinancialAggregatorApplication&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;SpringApplication&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;FinancialAggregatorApplication&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this in place, our app will start streaming simulated stock data to Kafka as soon as we run it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Perform calculations in the stream processor pipeline
&lt;/h2&gt;

&lt;p&gt;The stream processor allows us to consume the stock price data flowing through Kafka, calculate real-time analytics, and persist our enriched results to MongoDB.&lt;/p&gt;

&lt;p&gt;We'll use this to calculate some &lt;a href="https://www.investopedia.com/terms/r/rsi.asp" rel="noopener noreferrer"&gt;Relative Strength Indexes (RSIs)&lt;/a&gt;. The RSI is a momentum indicator used in financial analysis. RSI measures the speed and magnitude of a security's recent price changes to detect overbought or oversold conditions in the price of that security. It is an oscillatory metric (displayed as a line graph) on a scale of 0 to 100. We can add them to our documents before we store them in MongoDB.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fml2iegi60llytfg4mq26.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fml2iegi60llytfg4mq26.png" alt="Live stock data graph" width="800" height="445"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Using &lt;code&gt;MongoRepository&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;To save our live stock data into MongoDB, we’ll use Spring Data’s &lt;code&gt;MongoRepository&lt;/code&gt; interface. This gives us a ready-to-go set of CRUD operations with minimal setup. No boilerplate, just clean data access.&lt;/p&gt;

&lt;p&gt;Create a &lt;code&gt;repository&lt;/code&gt; package and add a new interface called &lt;code&gt;LiveDataRepository&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.repository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.model.LiveStockData&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.bson.types.ObjectId&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.mongodb.repository.MongoRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;interface&lt;/span&gt; &lt;span class="nc"&gt;LiveDataRepository&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;MongoRepository&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;LiveStockData&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, we're extending &lt;code&gt;MongoRepository&lt;/code&gt;, which is a Spring Data interface that automatically provides methods like &lt;code&gt;save()&lt;/code&gt;, &lt;code&gt;findAll()&lt;/code&gt;, &lt;code&gt;deleteById()&lt;/code&gt;, and more.&lt;/p&gt;

&lt;p&gt;We specify &lt;code&gt;LiveStockData&lt;/code&gt; as the type we’re working with, and &lt;code&gt;ObjectId&lt;/code&gt; as the ID type (since that’s what we used in our model).&lt;/p&gt;

&lt;p&gt;We can also define custom query methods here later, if needed (e.g., &lt;code&gt;findByCompany(String company)&lt;/code&gt;), but for now, the built-in methods are more than enough.&lt;/p&gt;

&lt;h3&gt;
  
  
  The RSI calculator
&lt;/h3&gt;

&lt;p&gt;In the &lt;code&gt;Utility&lt;/code&gt; package, add a new class called &lt;code&gt;RsiCalculator&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.utility&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.List&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;RsiCalculator&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="nf"&gt;calculate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Double&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;prices&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;period&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prices&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;size&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;period&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

        &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;gain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;period&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;diff&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prices&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;prices&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;diff&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="n"&gt;gain&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;diff&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
            &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;-=&lt;/span&gt; &lt;span class="n"&gt;diff&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="o"&gt;}&lt;/span&gt;  

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loss&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mf"&gt;100.0&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;rs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gain&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mf"&gt;100.0&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;100.0&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;rs&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, you can see that we take a rolling window of prices and compute RSI using the standard formula. If there's no loss (i.e., price only went up), the RSI maxes out at 100. Otherwise, it calculates the average gain/loss ratio over the given period.&lt;/p&gt;

&lt;p&gt;Now, in our &lt;code&gt;service&lt;/code&gt; package, we'll add a class &lt;code&gt;RsiStreamProcessor&lt;/code&gt;. Here, we will define our hook into the Kafka Streams pipeline, apply the RSI calculation, and persist the result to MongoDB.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.model.LiveStockData&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.repository.LiveDataRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.utility.LiveStockDataSerde&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.utility.RsiCalculator&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.apache.kafka.common.serialization.Serdes&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.apache.kafka.streams.StreamsBuilder&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.apache.kafka.streams.kstream.Consumed&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.apache.kafka.streams.kstream.KStream&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.beans.factory.annotation.Autowired&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Component&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.ArrayDeque&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.ArrayList&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Deque&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Map&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.concurrent.ConcurrentHashMap&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@Component&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;RsiStreamProcessor&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="no"&gt;RSI_PERIOD&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;14&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;LiveDataRepository&lt;/span&gt; &lt;span class="n"&gt;liveDataRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Deque&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Double&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;priceHistory&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ConcurrentHashMap&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;();&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;RsiStreamProcessor&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;LiveDataRepository&lt;/span&gt; &lt;span class="n"&gt;liveDataRepository&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;StreamsBuilder&lt;/span&gt; &lt;span class="n"&gt;streamsBuilder&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;liveDataRepository&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;liveDataRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="n"&gt;buildPipeline&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;streamsBuilder&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;buildPipeline&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;StreamsBuilder&lt;/span&gt; &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="nc"&gt;LiveStockDataSerde&lt;/span&gt; &lt;span class="n"&gt;liveSerde&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;LiveStockDataSerde&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
        &lt;span class="nc"&gt;KStream&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;LiveStockData&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;stream&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
                &lt;span class="s"&gt;"stock-prices"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
                &lt;span class="nc"&gt;Consumed&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Serdes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;String&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;liveSerde&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
        &lt;span class="o"&gt;);&lt;/span&gt;  

        &lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;peek&lt;/span&gt;&lt;span class="o"&gt;((&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Received from Kafka: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getCloseLast&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getCompany&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
                &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
            &lt;span class="o"&gt;}&lt;/span&gt;  

            &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;symbol&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getCompany&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
            &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;close&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getCloseLast&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  

            &lt;span class="nc"&gt;Deque&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Double&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;prices&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;priceHistory&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;computeIfAbsent&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;symbol&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ArrayDeque&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;());&lt;/span&gt;  
            &lt;span class="n"&gt;prices&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;addLast&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prices&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;size&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="no"&gt;RSI_PERIOD&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="n"&gt;prices&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;removeFirst&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prices&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;size&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="no"&gt;RSI_PERIOD&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
                &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;rsi&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;RsiCalculator&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;calculate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ArrayList&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;(&lt;/span&gt;&lt;span class="n"&gt;prices&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt; &lt;span class="no"&gt;RSI_PERIOD&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
                &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setRsi&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rsi&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
            &lt;span class="o"&gt;}&lt;/span&gt;  

            &lt;span class="n"&gt;liveDataRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;save&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="o"&gt;});&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can see in the &lt;code&gt;buildPipeline&lt;/code&gt; we first validate the data. Then, we track a sliding window of the last 15 prices. If we have enough data (at least 15 days to look through), we compute the RSI and attach it to the stock object. At the end of all of this, we can then save our data into our database with the RSI calculated.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;@Component&lt;/code&gt; annotation tells Spring this is a managed bean and should be picked up during component scanning.&lt;/p&gt;

&lt;p&gt;This gives us a fully reactive stream: Every second, simulated data flows in, RSI is computed in real time, and the result is persisted—all without writing a line of boilerplate consumer logic.&lt;/p&gt;

&lt;h3&gt;
  
  
  Serve up our data in an API
&lt;/h3&gt;

&lt;p&gt;We have our data processed and funnelled into our database. Let's look at getting some of that data out.&lt;/p&gt;

&lt;p&gt;We’ll expose a REST endpoint so we (or our front end) can query the latest stock data and indicators like RSI.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhgptyexsicpkos8d6wl2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhgptyexsicpkos8d6wl2.png" alt="Live stock data graph" width="800" height="427"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In our &lt;code&gt;service&lt;/code&gt; package, let’s create a new class, &lt;code&gt;FinancialService&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.model.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.repository.LiveDataRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Date&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.List&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@Service&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;FinancialService&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;LiveDataRepository&lt;/span&gt; &lt;span class="n"&gt;liveDataRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;FinancialService&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;LiveDataRepository&lt;/span&gt; &lt;span class="n"&gt;liveDataRepository&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;liveDataRepository&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;liveDataRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; 
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;LiveStockData&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getLiveData&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;liveDataRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findAll&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Nothing fancy—we inject the &lt;code&gt;LiveDataRepository&lt;/code&gt; and expose a &lt;code&gt;getLiveData()&lt;/code&gt; method that fetches everything from the &lt;code&gt;Live_Data&lt;/code&gt; collection using &lt;code&gt;findAll()&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;In a new package, &lt;code&gt;controller&lt;/code&gt;, create a class called &lt;code&gt;FinancialController&lt;/code&gt;. This is the part of the app that receives HTTP requests and sends responses back.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.controller&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.model.LiveStockData&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;   
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.service.FinancialService&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.web.bind.annotation.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Date&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.List&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@CrossOrigin&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;origins&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"http://localhost:5173"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;// Adjust if needed  &lt;/span&gt;
&lt;span class="nd"&gt;@RestController&lt;/span&gt;  
&lt;span class="nd"&gt;@RequestMapping&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"/stocks"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;FinancialController&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;FinancialService&lt;/span&gt; &lt;span class="n"&gt;financialService&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;FinancialController&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;FinancialService&lt;/span&gt; &lt;span class="n"&gt;financialService&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;financialService&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;financialService&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="c1"&gt;// test  &lt;/span&gt;
    &lt;span class="c1"&gt;// GET http://localhost:8080/stocks/live    &lt;/span&gt;
    &lt;span class="c1"&gt;// curl -v "http://localhost:8080/stocks/live"    &lt;/span&gt;
    &lt;span class="nd"&gt;@GetMapping&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"/live"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;produces&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"application/json"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;LiveStockData&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getLiveData&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;financialService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getLiveData&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
    &lt;span class="o"&gt;};&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This controller:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Maps the &lt;code&gt;/stocks&lt;/code&gt; route to this controller.
&lt;/li&gt;
&lt;li&gt;Exposes a &lt;code&gt;/stocks/live&lt;/code&gt; endpoint that returns a JSON list of all stored &lt;code&gt;LiveStockData&lt;/code&gt; documents.
&lt;/li&gt;
&lt;li&gt;Uses &lt;code&gt;@CrossOrigin&lt;/code&gt; to allow local frontend dev (like our React/Vite on port 5173) to make requests during development.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We'll be back here later to add more endpoints. Now that we have our live data being fed to our database and presented in our API, it's time to look at some of the aggregations we can do on data in our database.&lt;/p&gt;

&lt;h2&gt;
  
  
  Perform aggregations on our historical data
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Our stock market data model
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.model&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.bson.types.ObjectId&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.annotation.Id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.mongodb.core.mapping.Document&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.mongodb.core.mapping.TimeSeries&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Date&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Historical_Data"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
&lt;span class="nd"&gt;@TimeSeries&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Historical_Data"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeField&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Date"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metaField&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Company"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;StockMarketData&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="nd"&gt;@Id&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;closeLast&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;low&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;volume&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;high&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;StockMarketData&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;StockMarketData&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ObjectId&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;closeLast&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;low&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;volume&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;high&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;company&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;closeLast&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;closeLast&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;open&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;low&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;low&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;volume&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;volume&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;high&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;high&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;StockMarketData&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;closeLast&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;low&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;&lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;volume&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;high&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;company&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;closeLast&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;closeLast&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;open&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;low&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;low&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;volume&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;volume&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;high&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;high&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt; &lt;span class="nf"&gt;getId&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setId&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ObjectId&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="nf"&gt;getDate&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setDate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getCompany&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setCompany&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;company&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="nf"&gt;getCloseLast&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;closeLast&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setCloseLast&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;closeLast&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;closeLast&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;closeLast&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="nf"&gt;getOpen&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setOpen&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;open&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="nf"&gt;getLow&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;low&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setLow&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;low&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;low&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;low&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="nf"&gt;getVolume&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;volume&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setVolume&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;volume&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;volume&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;volume&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="nf"&gt;getHigh&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;high&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setHigh&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;high&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;high&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;high&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Read in historical data
&lt;/h2&gt;

&lt;p&gt;In addition to our real-time stream, we’ll bring in some historical stock data for context and longer-term trend analysis, showing off a few of MongoDB's aggregations specifically for analysing trends in financial data.&lt;/p&gt;

&lt;p&gt;We’re using the &lt;a href="https://www.kaggle.com/datasets/khushipitroda/stock-market-historical-data-of-top-10-companies" rel="noopener noreferrer"&gt;Stock Market: Historical Data of Top 10 Companies&lt;/a&gt; dataset from Kaggle. It contains 25,161 rows, each representing stock market activity for a major company on a specific date between July 2013 and July 2023. The dataset includes daily open, high, low, close prices, and trading volumes for companies like Apple, Starbucks, Microsoft, Cisco, Qualcomm, Meta, Amazon, Tesla, AMD, and Netflix.&lt;/p&gt;

&lt;p&gt;Download the dataset and place it in your project’s &lt;code&gt;resources&lt;/code&gt; directory with the filename &lt;code&gt;data.csv&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating a repository for historical data
&lt;/h3&gt;

&lt;p&gt;To load this data into MongoDB, we’ll use a &lt;code&gt;MongoRepository&lt;/code&gt; just like we did for live data.&lt;/p&gt;

&lt;p&gt;Create a &lt;code&gt;repository&lt;/code&gt; package and add an interface called &lt;code&gt;FinancialRepository&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.repository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.model.StockMarketData&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.bson.types.ObjectId&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.mongodb.repository.MongoRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Date&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.List&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;interface&lt;/span&gt; &lt;span class="nc"&gt;FinancialRepository&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;MongoRepository&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;StockMarketData&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;StockMarketData&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;findAllByCompanyAndDateBetween&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;startDate&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;endDate&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This interface gives us access to &lt;code&gt;save()&lt;/code&gt; and &lt;code&gt;saveAll()&lt;/code&gt; for inserting data, and we’ve also added a custom method—&lt;br&gt;&lt;br&gt;
&lt;code&gt;findAllByCompanyAndDateBetween(...)&lt;/code&gt;—which Spring Data will automatically implement for us. This will be handy when calculating things like SMA and EMA over a time range.&lt;/p&gt;
&lt;h3&gt;
  
  
  Reading and importing the CSV
&lt;/h3&gt;

&lt;p&gt;Now, let’s read the historical CSV and insert it into MongoDB.&lt;/p&gt;

&lt;p&gt;Create a class called &lt;code&gt;ReadInHistoricalData&lt;/code&gt; in your &lt;code&gt;service&lt;/code&gt; package:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.model.StockMarketData&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.repository.FinancialRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.boot.ApplicationRunner&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.core.io.ClassPathResource&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.io.BufferedReader&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.io.InputStreamReader&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.text.ParseException&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.text.SimpleDateFormat&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="nd"&gt;@Service&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ReadInHistoricalData&lt;/span&gt; &lt;span class="kd"&gt;implements&lt;/span&gt; &lt;span class="nc"&gt;ApplicationRunner&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;FinancialRepository&lt;/span&gt; &lt;span class="n"&gt;financialRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="no"&gt;BATCH_SIZE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Insert in chunks for performance&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;ReadInHistoricalData&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;FinancialRepository&lt;/span&gt; &lt;span class="n"&gt;financialRepository&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;financialRepository&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;financialRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="nd"&gt;@Override&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;org&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;springframework&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;boot&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ApplicationArguments&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;loadData&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;loadData&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;fileName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"data.csv"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;BufferedReader&lt;/span&gt; &lt;span class="n"&gt;reader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BufferedReader&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;InputStreamReader&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ClassPathResource&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fileName&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;getInputStream&lt;/span&gt;&lt;span class="o"&gt;())))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

            &lt;span class="c1"&gt;// Skip header&lt;/span&gt;
            &lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;readLine&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

            &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;StockMarketData&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;batch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ArrayList&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;();&lt;/span&gt;
            &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

            &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="o"&gt;((&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;readLine&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="nc"&gt;StockMarketData&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parseCsvLine&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                    &lt;span class="n"&gt;batch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;add&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
                &lt;span class="o"&gt;}&lt;/span&gt;

                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;batch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;size&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="no"&gt;BATCH_SIZE&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                    &lt;span class="n"&gt;financialRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;saveAll&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;batch&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
                    &lt;span class="n"&gt;batch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;clear&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
                &lt;span class="o"&gt;}&lt;/span&gt;
            &lt;span class="o"&gt;}&lt;/span&gt;

            &lt;span class="c1"&gt;// Save any remaining data&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(!&lt;/span&gt;&lt;span class="n"&gt;batch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isEmpty&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;financialRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;saveAll&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;batch&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="o"&gt;}&lt;/span&gt;

            &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Data import completed successfully."&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Exception&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;err&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Error loading CSV: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getMessage&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
            &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;printStackTrace&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;StockMarketData&lt;/span&gt; &lt;span class="nf"&gt;parseCsvLine&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;fields&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;split&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;","&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fields&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;err&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Skipping invalid line: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
            &lt;span class="o"&gt;}&lt;/span&gt;

            &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;company&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fields&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;].&lt;/span&gt;&lt;span class="na"&gt;trim&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
            &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parseDate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fields&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;].&lt;/span&gt;&lt;span class="na"&gt;trim&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
            &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;closeLast&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parseDouble&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fields&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;]);&lt;/span&gt;
            &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;volume&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parseDouble&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fields&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;]);&lt;/span&gt;
            &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;open&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parseDouble&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fields&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="o"&gt;]);&lt;/span&gt;
            &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;high&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parseDouble&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fields&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;]);&lt;/span&gt;
            &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;low&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parseDouble&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fields&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="o"&gt;]);&lt;/span&gt;

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;date&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;err&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Skipping row due to invalid date: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
            &lt;span class="o"&gt;}&lt;/span&gt;

            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;StockMarketData&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;closeLast&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;low&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;volume&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;high&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Exception&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;err&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Error parsing line: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="nf"&gt;parseDate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;dateStr&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;dateFormats&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"MM/dd/yyyy"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"MM-dd-yyyy"&lt;/span&gt;&lt;span class="o"&gt;};&lt;/span&gt;

        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;format&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;dateFormats&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;SimpleDateFormat&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Locale&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;US&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;parse&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dateStr&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ParseException&lt;/span&gt; &lt;span class="n"&gt;ignored&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;

        &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;err&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Unrecognized date format: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;dateStr&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="nf"&gt;parseDouble&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;parseDouble&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;replace&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"$"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;trim&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;NumberFormatException&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This does a few important things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;It uses &lt;code&gt;ApplicationRunner&lt;/code&gt; to run once at startup and load the data (which we’ll remove later).  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It reads the &lt;code&gt;data.csv&lt;/code&gt; file from &lt;code&gt;resources&lt;/code&gt; using a buffered reader.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It parses and batches records into chunks of 500 for efficient writes.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It handles some common data parsing issues (like dollar signs and inconsistent date formats).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Tip: One-time data load
&lt;/h3&gt;

&lt;p&gt;Once the data is successfully loaded into MongoDB, we can prevent it from running again on every startup by removing the &lt;code&gt;implements ApplicationRunner&lt;/code&gt; line and deleting this method:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Override&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;org&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;springframework&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;boot&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ApplicationArguments&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;loadData&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will leave us with a reusable &lt;code&gt;loadData()&lt;/code&gt; method we can call manually if needed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Custom financial repository
&lt;/h2&gt;

&lt;p&gt;We already have a basic repository for loading data. But now we want to calculate things like moving averages, and that’s where things get a little more advanced. Instead of fetching raw documents and calculating these averages manually in memory in Java, we’re going to lean on MongoDB’s aggregation framework to do the heavy lifting for us.&lt;/p&gt;

&lt;p&gt;Let’s start by creating a few records in our &lt;code&gt;model&lt;/code&gt; package.&lt;/p&gt;

&lt;p&gt;First, we'll define a simple record for the company names. Create a record called &lt;code&gt;CompanyDTO&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.model&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="nf"&gt;CompanyDTO&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We'll use this to retrieve a list of unique company names for our front end, that we can use to filter our stock information for graphing.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3spc68vq8hcvayg0cg58.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3spc68vq8hcvayg0cg58.png" alt="Company select drop down" width="800" height="468"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, we need a record called &lt;code&gt;SimpleMovingAverageDTO&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.model&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Date&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="nf"&gt;SimpleMovingAverageDTO&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;sma&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Lastly, let’s define a DTO first called &lt;code&gt;ExponentialMovingAverageDTO&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.model&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Date&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="nf"&gt;ExponentialMovingAverageDTO&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;ema&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Up next, we'll create our interface in the &lt;code&gt;repository&lt;/code&gt; package called &lt;code&gt;CustomFinancialRepository&lt;/code&gt;, where we'll define our methods.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.repository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.model.CompanyDTO&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.model.ExponentialMovingAverageDTO&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.model.SimpleMovingAverageDTO&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Date&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.List&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;interface&lt;/span&gt; &lt;span class="nc"&gt;CustomFinancialRepository&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;CompanyDTO&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getAllCompanies&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
    &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;SimpleMovingAverageDTO&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getSimpleMovingAverage&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;startDate&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;endDate&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;ExponentialMovingAverageDTO&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getExponentialMovingAverage&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;startDate&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;endDate&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, we’ll set up our custom repository implementation. This is where we’ll drop in our aggregation logic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.repository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.model.CompanyDTO&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.model.ExponentialMovingAverageDTO&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.model.SimpleMovingAverageDTO&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.domain.Sort&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.mongodb.core.MongoTemplate&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.mongodb.core.aggregation.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.mongodb.core.query.Criteria&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Repository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Date&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.List&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;static&lt;/span&gt; &lt;span class="n"&gt;org&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;springframework&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;mongodb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;core&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;aggregation&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Aggregation&lt;/span&gt;&lt;span class="o"&gt;.*;&lt;/span&gt;  

&lt;span class="nd"&gt;@Repository&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CustomFinancialRepositoryImplementation&lt;/span&gt; &lt;span class="kd"&gt;implements&lt;/span&gt; &lt;span class="nc"&gt;CustomFinancialRepository&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="no"&gt;EXPONENTIAL_MOVING_AVERAGE_PERIOD&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="no"&gt;SIMPLE_MOVING_AVERAGE_PERIOD&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;MongoTemplate&lt;/span&gt; &lt;span class="n"&gt;mongoTemplate&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;CustomFinancialRepositoryImplementation&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MongoTemplate&lt;/span&gt; &lt;span class="n"&gt;mongoTemplate&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;mongoTemplate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mongoTemplate&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="nd"&gt;@Override&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;CompanyDTO&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getAllCompanies&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="nc"&gt;Aggregation&lt;/span&gt; &lt;span class="n"&gt;getListOfUniqueCompanies&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;newAggregation&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
                &lt;span class="n"&gt;group&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"company"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;first&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"company"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;as&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"name"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;  
                &lt;span class="n"&gt;sort&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Sort&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;by&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"name"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;  
        &lt;span class="o"&gt;);&lt;/span&gt;  

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;mongoTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;aggregate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
                &lt;span class="n"&gt;getListOfUniqueCompanies&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Historical_Data"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;CompanyDTO&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;  
        &lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;getMappedResults&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;getAllCompanies()&lt;/code&gt; method grabs all unique companies in our &lt;code&gt;Historical_Data&lt;/code&gt; collection. We’re grouping by the &lt;code&gt;"company"&lt;/code&gt; field and grabbing the first one per group. It’s a clean way to get a list of available stock symbols to work with.&lt;/p&gt;

&lt;p&gt;We also define some values we'll use for windows when calculating our moving averages.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="no"&gt;EXPONENTIAL_MOVING_AVERAGE_PERIOD&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="no"&gt;SIMPLE_MOVING_AVERAGE_PERIOD&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, let’s move into the actual indicators.&lt;/p&gt;

&lt;h2&gt;
  
  
  Simple Moving Average
&lt;/h2&gt;

&lt;p&gt;A Simple Moving Average (SMA) is what it sounds like. It takes the average closing price over a rolling window. This smooths out daily fluctuations and helps us see the trend over time.&lt;/p&gt;

&lt;p&gt;Now, the actual implementation. We will add the following method to our &lt;code&gt;CustomFinancialRepositoryImplementation&lt;/code&gt; implementation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;    &lt;span class="nd"&gt;@Override&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;SimpleMovingAverageDTO&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getSimpleMovingAverage&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;startDate&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;endDate&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="nc"&gt;Criteria&lt;/span&gt; &lt;span class="n"&gt;criteria&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Criteria&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"company"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;is&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;and&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"date"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;gte&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;startDate&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;lte&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;endDate&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

        &lt;span class="c1"&gt;// Define SMA calculation (5-day moving average)  &lt;/span&gt;
        &lt;span class="nc"&gt;SetWindowFieldsOperation&lt;/span&gt; &lt;span class="n"&gt;smaOperation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SetWindowFieldsOperation&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;partitionBy&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"company"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;// Ensures calculations are per company  &lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sortBy&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Sort&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;by&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Sort&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Direction&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ASC&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"date"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="c1"&gt;// Sort ascending by Date  &lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;output&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;AccumulatorOperators&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Avg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;avgOf&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"closeLast"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="c1"&gt;// Compute SMA using avg()  &lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;within&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;SetWindowFieldsOperation&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Windows&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;documents&lt;/span&gt;&lt;span class="o"&gt;(-&lt;/span&gt;&lt;span class="no"&gt;SIMPLE_MOVING_AVERAGE_PERIOD&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="c1"&gt;// Defines a 5-day window (4 previous + current)  &lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;as&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"sma"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;// Output field named "sma"  &lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  

        &lt;span class="nc"&gt;Aggregation&lt;/span&gt; &lt;span class="n"&gt;aggregation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;newAggregation&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
                &lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;criteria&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;  
                &lt;span class="n"&gt;smaOperation&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
                &lt;span class="n"&gt;project&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"sma"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;and&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"date"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;as&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"date"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;// Keep only SMA and Date fields  &lt;/span&gt;
        &lt;span class="o"&gt;);&lt;/span&gt;  

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;mongoTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;aggregate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
                &lt;span class="n"&gt;aggregation&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Historical_Data"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;SimpleMovingAverageDTO&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;  
        &lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;getMappedResults&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We’re using MongoDB’s &lt;a href="https://www.mongodb.com/docs/manual/reference/operator/aggregation/setWindowFields/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Financial+Aggregator+Spring+Kafka&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;&lt;code&gt;$setWindowFields&lt;/code&gt;&lt;/a&gt; operator here. This allows us to do rolling window calculations directly inside the database. We’re asking it to average the &lt;code&gt;closeLast&lt;/code&gt; value for each document, across a sliding five-day window (the current doc plus the four before it), for a single company.&lt;/p&gt;

&lt;p&gt;The result? We get a list of dates with a corresponding &lt;code&gt;sma&lt;/code&gt; value.  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw09puqhzcvtl4z986978.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw09puqhzcvtl4z986978.png" alt="Simple Moving Average graph" width="800" height="392"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Exponential Moving Average
&lt;/h2&gt;

&lt;p&gt;The Exponential Moving Average is similar to the SMA, but instead of giving every day equal weight, it weights recent days more heavily. This makes the EMA a bit more reactive to recent price movement, which traders often prefer.&lt;/p&gt;

&lt;p&gt;And now, let’s add the method that calculates it to our &lt;code&gt;CustomFinancialRepositoryImplementation&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;    &lt;span class="nd"&gt;@Override&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;ExponentialMovingAverageDTO&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getExponentialMovingAverage&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;startDate&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;endDate&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="nc"&gt;Criteria&lt;/span&gt; &lt;span class="n"&gt;criteria&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Criteria&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"company"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;is&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;and&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"date"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;gte&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;startDate&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;lte&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;endDate&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

        &lt;span class="nc"&gt;SetWindowFieldsOperation&lt;/span&gt; &lt;span class="n"&gt;emaOperation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SetWindowFieldsOperation&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;partitionBy&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"company"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sortBy&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Sort&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;by&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Sort&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Direction&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ASC&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"date"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;output&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;AccumulatorOperators&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ExpMovingAvg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;expMovingAvgOf&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"closeLast"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;n&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;EXPONENTIAL_MOVING_AVERAGE_PERIOD&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;as&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"ema"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  

        &lt;span class="nc"&gt;Aggregation&lt;/span&gt; &lt;span class="n"&gt;aggregation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;newAggregation&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
                &lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;criteria&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;  
                &lt;span class="n"&gt;emaOperation&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
                &lt;span class="n"&gt;project&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"ema"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;and&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"date"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;as&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"date"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
        &lt;span class="o"&gt;);&lt;/span&gt;  

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;mongoTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;aggregate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
                &lt;span class="n"&gt;aggregation&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Historical_Data"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;ExponentialMovingAverageDTO&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;  
        &lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;getMappedResults&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is almost identical to the SMA implementation, except we swap out &lt;code&gt;$avgOf&lt;/code&gt; for &lt;a href="https://www.mongodb.com/docs/manual/reference/operator/aggregation/expMovingAvg/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Financial+Aggregator+Spring+Kafka&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;&lt;code&gt;$expMovingAvgOf&lt;/code&gt;&lt;/a&gt;, and give it a period (&lt;code&gt;n&lt;/code&gt;) to define how far back to weigh things. MongoDB handles the weighting math, so we don’t need to.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiu0wuz0p2rj5d5kgqk6c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiu0wuz0p2rj5d5kgqk6c.png" alt="Exponential Moving Average graph" width="800" height="390"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Let’s test it
&lt;/h2&gt;

&lt;p&gt;Now, with the aggregations done, all we need is a service to expose these methods to our controllers. This is the part of the app that acts as a go-between: It calls our repositories, applies any business logic (if needed), and returns results to the controller so they can be exposed through our API.&lt;/p&gt;

&lt;h3&gt;
  
  
  Update our FinancialService class
&lt;/h3&gt;

&lt;p&gt;Here’s the full and final &lt;code&gt;FinancialService&lt;/code&gt; class:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.model.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.repository.CustomFinancialRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.repository.FinancialRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.repository.LiveDataRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Date&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.List&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@Service&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;FinancialService&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;FinancialRepository&lt;/span&gt; &lt;span class="n"&gt;financialRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;CustomFinancialRepository&lt;/span&gt; &lt;span class="n"&gt;customFinancialRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;LiveDataRepository&lt;/span&gt; &lt;span class="n"&gt;liveDataRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;FinancialService&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;CustomFinancialRepository&lt;/span&gt; &lt;span class="n"&gt;customFinancialRepository&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
                            &lt;span class="nc"&gt;FinancialRepository&lt;/span&gt; &lt;span class="n"&gt;financialRepository&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
                            &lt;span class="nc"&gt;LiveDataRepository&lt;/span&gt; &lt;span class="n"&gt;liveDataRepository&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;customFinancialRepository&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;customFinancialRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;financialRepository&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;financialRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;liveDataRepository&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;liveDataRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getAllCompanies&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;CompanyDTO&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;companiesList&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;customFinancialRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getAllCompanies&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;companiesList&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;map&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nl"&gt;CompanyDTO:&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toList&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;StockMarketData&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getStockDataBetweenDates&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;startDate&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;endDate&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;financialRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findAllByCompanyAndDateBetween&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;startDate&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;endDate&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;SimpleMovingAverageDTO&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getSimpleMovingAverage&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;companyName&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;startDate&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;endDate&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;customFinancialRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getSimpleMovingAverage&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;companyName&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;startDate&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;endDate&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;ExponentialMovingAverageDTO&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getExponentialMovingAverage&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;companyName&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;startDate&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;endDate&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;customFinancialRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getExponentialMovingAverage&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;companyName&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;startDate&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;endDate&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;LiveStockData&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getLiveData&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;liveDataRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findAll&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We've added our new repositories to the constructor, so we can access the methods. We've also called the methods we created in these repositories:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;getAllCompanies()&lt;/code&gt; fetches a list of unique company names from our historical dataset.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;getStockDataBetweenDates(...)&lt;/code&gt; pulls raw historical records between two dates, no aggregations applied.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;getSimpleMovingAverage(...)&lt;/code&gt; and &lt;code&gt;getExponentialMovingAverage(...)&lt;/code&gt; call the aggregation pipelines we built in the custom repository.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Adding our endpoints
&lt;/h3&gt;

&lt;p&gt;Let's go back to the &lt;code&gt;FinancialController&lt;/code&gt; we built earlier and add some endpoints for our new methods.&lt;/p&gt;

&lt;p&gt;This controller will expose everything we’ve been working on—historical data queries, technical indicators, and even our live Kafka-driven data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.controller&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.model.ExponentialMovingAverageDTO&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.model.LiveStockData&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.model.SimpleMovingAverageDTO&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.model.StockMarketData&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.financialaggregator.service.FinancialService&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.format.annotation.DateTimeFormat&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.web.bind.annotation.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Date&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.List&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@CrossOrigin&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;origins&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"http://localhost:5173"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;// Adjust if needed  &lt;/span&gt;
&lt;span class="nd"&gt;@RestController&lt;/span&gt;  
&lt;span class="nd"&gt;@RequestMapping&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"/stocks"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;FinancialController&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;FinancialService&lt;/span&gt; &lt;span class="n"&gt;financialService&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;FinancialController&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;FinancialService&lt;/span&gt; &lt;span class="n"&gt;financialService&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;financialService&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;financialService&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="c1"&gt;// test  &lt;/span&gt;
    &lt;span class="c1"&gt;// GET http://localhost:8080/stocks/companies    // curl -v "http://localhost:8080/stocks/companies"    @GetMapping(value = "/companies", produces = "application/json")  &lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getAllCompanies&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;financialService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getAllCompanies&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="c1"&gt;// test  &lt;/span&gt;
    &lt;span class="c1"&gt;// GET http://localhost:8080/stocks/companyStocksBetween?company=AAPL&amp;amp;startDate=2022-17-10&amp;amp;endDate=2023-07-17    // curl -v "http://localhost:8080/stocks/companyStocksBetween?company=AAPL&amp;amp;startDate=2022-07-17&amp;amp;endDate=2023-07-17"    @GetMapping(value = "/companyStocksBetween", produces = "application/json")  &lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;StockMarketData&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getStockDataBetweenDates&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
            &lt;span class="nd"&gt;@RequestParam&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
            &lt;span class="nd"&gt;@RequestParam&lt;/span&gt; &lt;span class="nd"&gt;@DateTimeFormat&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;iso&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DateTimeFormat&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ISO&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;DATE&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;startDate&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
            &lt;span class="nd"&gt;@RequestParam&lt;/span&gt; &lt;span class="nd"&gt;@DateTimeFormat&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;iso&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DateTimeFormat&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ISO&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;DATE&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;endDate&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;financialService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getStockDataBetweenDates&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;startDate&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;endDate&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;};&lt;/span&gt;  

    &lt;span class="c1"&gt;// test  &lt;/span&gt;
    &lt;span class="c1"&gt;// GET http://localhost:8080/stocks/sma?company=AAPL&amp;amp;startDate=2023-07-10&amp;amp;endDate=2023-07-17    // curl -v "http://localhost:8080/stocks/sma?company=AAPL&amp;amp;startDate=2022-07-10&amp;amp;endDate=2023-07-17"    @GetMapping(value = "/sma", produces = "application/json")  &lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;SimpleMovingAverageDTO&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getSmaForCompany&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
            &lt;span class="nd"&gt;@RequestParam&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
            &lt;span class="nd"&gt;@RequestParam&lt;/span&gt; &lt;span class="nd"&gt;@DateTimeFormat&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;iso&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DateTimeFormat&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ISO&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;DATE&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;startDate&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
            &lt;span class="nd"&gt;@RequestParam&lt;/span&gt; &lt;span class="nd"&gt;@DateTimeFormat&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;iso&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DateTimeFormat&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ISO&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;DATE&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;endDate&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;financialService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getSimpleMovingAverage&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;startDate&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;endDate&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;};&lt;/span&gt;  

    &lt;span class="c1"&gt;// test  &lt;/span&gt;
    &lt;span class="c1"&gt;// GET http://localhost:8080/stocks/ema?company=AAPL&amp;amp;startDate=2023-07-10&amp;amp;endDate=2023-07-17    // curl -v "http://localhost:8080/stocks/sma?company=AAPL&amp;amp;startDate=2023-07-10&amp;amp;endDate=2023-07-17"    @GetMapping(value= "/ema", produces = "application/json")  &lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;ExponentialMovingAverageDTO&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getEmaForCompany&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
            &lt;span class="nd"&gt;@RequestParam&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
            &lt;span class="nd"&gt;@RequestParam&lt;/span&gt; &lt;span class="nd"&gt;@DateTimeFormat&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;iso&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DateTimeFormat&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ISO&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;DATE&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;startDate&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
            &lt;span class="nd"&gt;@RequestParam&lt;/span&gt; &lt;span class="nd"&gt;@DateTimeFormat&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;iso&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DateTimeFormat&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ISO&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;DATE&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;endDate&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;financialService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getExponentialMovingAverage&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;startDate&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;endDate&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;};&lt;/span&gt;  

    &lt;span class="c1"&gt;// test  &lt;/span&gt;
    &lt;span class="c1"&gt;// GET http://localhost:8080/stocks/live    // curl -v "http://localhost:8080/stocks/live"    @GetMapping(value= "/live", produces = "application/json")  &lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;LiveStockData&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getLiveData&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;financialService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getLiveData&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
    &lt;span class="o"&gt;};&lt;/span&gt;  

&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every method here maps directly to one in &lt;code&gt;FinancialService&lt;/code&gt;. We keep this controller thin—just parsing request parameters, calling the service, and returning the result.&lt;/p&gt;

&lt;p&gt;Let's walk through the new endpoints we just added.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getAllCompanies&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;financialService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getAllCompanies&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This endpoint returns a list of all the unique companies in our historical dataset.&lt;/p&gt;

&lt;p&gt;Behind the scenes, the &lt;code&gt;FinancialService&lt;/code&gt; calls a MongoDB aggregation that groups documents by the &lt;code&gt;company&lt;/code&gt; field, sorts them alphabetically, and maps them into a &lt;code&gt;CompanyDTO&lt;/code&gt;. This method simply extracts the &lt;code&gt;.name&lt;/code&gt; field from each of those records and returns the result as a &lt;code&gt;List&amp;lt;String&amp;gt;&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;StockMarketData&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getStockDataBetweenDates&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
            &lt;span class="nd"&gt;@RequestParam&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
            &lt;span class="nd"&gt;@RequestParam&lt;/span&gt; &lt;span class="nd"&gt;@DateTimeFormat&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;iso&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DateTimeFormat&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ISO&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;DATE&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;startDate&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
            &lt;span class="nd"&gt;@RequestParam&lt;/span&gt; &lt;span class="nd"&gt;@DateTimeFormat&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;iso&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DateTimeFormat&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ISO&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;DATE&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;endDate&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;financialService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getStockDataBetweenDates&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;startDate&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;endDate&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;};&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This one gives us the raw historical stock data for a given company between two dates.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;@RequestParam&lt;/code&gt; annotations pull values straight from the query string.
&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;@DateTimeFormat(iso = DateTimeFormat.ISO.DATE)&lt;/code&gt; ensures that Spring can parse the date strings correctly, e.g., &lt;code&gt;2023-01-01&lt;/code&gt;.
&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;FinancialService&lt;/code&gt; calls a Spring Data repository method that uses a MongoDB query filter: &lt;code&gt;company == ...&lt;/code&gt; and &lt;code&gt;date &amp;gt;= startDate &amp;amp;&amp;amp; &amp;lt;= endDate&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We'll use this to populate the candlestick graph in the front end.  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffsdo95wqxhy60cu23mnh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffsdo95wqxhy60cu23mnh.png" alt="Candlestick graph for historical data" width="800" height="390"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;SimpleMovingAverageDTO&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getSmaForCompany&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
            &lt;span class="nd"&gt;@RequestParam&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
            &lt;span class="nd"&gt;@RequestParam&lt;/span&gt; &lt;span class="nd"&gt;@DateTimeFormat&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;iso&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DateTimeFormat&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ISO&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;DATE&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;startDate&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
            &lt;span class="nd"&gt;@RequestParam&lt;/span&gt; &lt;span class="nd"&gt;@DateTimeFormat&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;iso&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DateTimeFormat&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ISO&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;DATE&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;endDate&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;financialService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getSimpleMovingAverage&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;startDate&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;endDate&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;};&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This endpoint returns a Simple Moving Average for a given company over a specific date range.&lt;/p&gt;

&lt;p&gt;This method uses the MongoDB aggregation pipeline we set up to compute a rolling average of the &lt;code&gt;closeLast&lt;/code&gt; field, and returns an array we can plot as a line graph.&lt;/p&gt;

&lt;p&gt;Each record in the response includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;code&gt;sma&lt;/code&gt; value (the average).
&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;date&lt;/code&gt; it corresponds to.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;ExponentialMovingAverageDTO&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getEmaForCompany&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
            &lt;span class="nd"&gt;@RequestParam&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
            &lt;span class="nd"&gt;@RequestParam&lt;/span&gt; &lt;span class="nd"&gt;@DateTimeFormat&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;iso&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DateTimeFormat&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ISO&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;DATE&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;startDate&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
            &lt;span class="nd"&gt;@RequestParam&lt;/span&gt; &lt;span class="nd"&gt;@DateTimeFormat&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;iso&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DateTimeFormat&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ISO&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;DATE&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;endDate&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;financialService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getExponentialMovingAverage&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;startDate&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;endDate&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;};&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This one’s similar to the SMA endpoint but calculates an EMA instead.&lt;/p&gt;

&lt;p&gt;The key difference is that recent values are weighted more heavily, but it still returns that array we can plot as a line graph.&lt;/p&gt;

&lt;p&gt;This is useful for traders looking for momentum shifts or for graphing smoother, more sensitive trendlines.&lt;/p&gt;

&lt;h3&gt;
  
  
  Running our application
&lt;/h3&gt;

&lt;p&gt;If you’re using Kafka 3.x with KRaft mode (no ZooKeeper), you must initialize your Kafka log directory with &lt;code&gt;kafka-storage.sh format&lt;/code&gt; before starting the server. This only needs to be done once per broker.&lt;/p&gt;

&lt;p&gt;Feel free to check a full detailed walkthrough on &lt;a href="https://kafka.apache.org/quickstart" rel="noopener noreferrer"&gt;getting started with Kafka&lt;/a&gt;, but we’ll just do what we need to get started.&lt;/p&gt;

&lt;p&gt;Choose any UUID for your cluster (or generate one). You can generate one like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CLUSTER_ID=$(bin/kafka-storage.sh random-uuid) echo $CLUSTER_ID
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run this command to format the KRaft storage:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;bin/kafka-storage.sh format -t $CLUSTER_ID -c config/kraft/server.properties
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This writes the metadata to the log dirs and makes the broker startable.&lt;/p&gt;

&lt;p&gt;Now, start the Kafka server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;bin/kafka-server-start.sh config/kraft/server.properties
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see logs like: &lt;code&gt;INFO [KafkaServer id=0] started (kafka.server.KafkaServer)&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;You can now run your Spring application, and after we give some time for everything to start up, we should start to see some of our logs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Received from Kafka: com.mongodb.financialaggregator.model.LiveStockData@2b0aaf9c
========== STOCK DEBUG INFO ==========
Company:       DOOF
Date (raw):    Sun Jun 22 00:00:00 IST 2025
Date (class):  java.util.Date
Open:          111.12
Close/Last:    115.26
High:          123.71
Low:           102.81
Volume:        1054.14
======================================
Sent successfully.

Received from Kafka: com.mongodb.financialaggregator.model.LiveStockData@398892bb
========== STOCK DEBUG INFO ==========
Company:       DOOF
Date (raw):    Mon Jun 23 00:00:00 IST 2025
Date (class):  java.util.Date
Open:          116.53
Close/Last:    111.57
High:          127.41
Low:           106.09
Volume:        1282.71
======================================
Sent successfully.

Received from Kafka: com.mongodb.financialaggregator.model.LiveStockData@31ca10ba
========== STOCK DEBUG INFO ==========
Company:       DOOF
Date (raw):    Tue Jun 24 00:00:00 IST 2025
Date (class):  java.util.Date
Open:          112.77
Close/Last:    117.03
High:          128.63
Low:           106.69
Volume:        1418.38
======================================
Sent successfully.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And we can verify this in our MongoDB collection. Our &lt;code&gt;Historical_Data&lt;/code&gt; collection will take a little longer to populate, but it shouldn't be more than a few seconds until we start to see data in our &lt;code&gt;Live_Data&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;If we want to test out our endpoints, spin up the React front end I provided in the GitHub repo that allows us to look at our data and aggregations, graphed.&lt;/p&gt;

&lt;p&gt;If you want to test out the endpoints manually, we can run these curl commands:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# All companies
curl -v http://localhost:8080/stocks/companies

# Historical data for AAPL
curl -v "http://localhost:8080/stocks/companyStocksBetween?company=AAPL&amp;amp;startDate=2023-01-01&amp;amp;endDate=2023-02-01"

# SMA
curl -v "http://localhost:8080/stocks/sma?company=AAPL&amp;amp;startDate=2023-01-01&amp;amp;endDate=2023-02-01"

# EMA
curl -v "http://localhost:8080/stocks/ema?company=AAPL&amp;amp;startDate=2023-01-01&amp;amp;endDate=2023-02-01"

# Live data
curl -v http://localhost:8080/stocks/live

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;This isn’t just another toy project pretending to be real-time. We’ve built a proper streaming architecture—one that ingests live data, enriches it on the fly with RSI, persists it with zero ceremony using Spring Data MongoDB, and exposes it through a clean, RESTful interface. On the historical side, we’ve leaned on MongoDB’s aggregation framework to calculate indicators like SMA and EMA directly in the database, avoiding the overhead of processing everything in memory or shipping it off to a separate analytics engine.&lt;/p&gt;

&lt;p&gt;If you want to learn more about what can be done with MongoDB and Kafka, check out my tutorial, &lt;a href="https://dev.to/mongodb/building-a-real-time-ai-fraud-detection-system-with-spring-kafka-and-mongodb-2jbn"&gt;Building a Real-Time AI Fraud Detection System with Spring Kafka and MongoDB&lt;/a&gt;, or learn how to create an app in &lt;a href="https://medium.com/@MongoDB/demand-forecasting-with-mongodb-time-series-and-edge-server-198aa5a5511e" rel="noopener noreferrer"&gt;Demand Forecasting With MongoDB Time Series and Edge-Server&lt;/a&gt; over on &lt;a href="https://medium.com/@MongoDB" rel="noopener noreferrer"&gt;Medium&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>mongodb</category>
      <category>kafka</category>
      <category>timeseries</category>
      <category>java</category>
    </item>
    <item>
      <title>Building a Real-Time AI Fraud Detection System with Spring Kafka and MongoDB</title>
      <dc:creator>Tim Kelly</dc:creator>
      <pubDate>Wed, 16 Apr 2025 14:00:00 +0000</pubDate>
      <link>https://forem.com/mongodb/building-a-real-time-ai-fraud-detection-system-with-spring-kafka-and-mongodb-2jbn</link>
      <guid>https://forem.com/mongodb/building-a-real-time-ai-fraud-detection-system-with-spring-kafka-and-mongodb-2jbn</guid>
      <description>&lt;p&gt;In this tutorial, we'll build a real-time fraud detection system using MongoDB Atlas Vector Search, Apache Kafka, and AI-generated embeddings. We'll demonstrate how MongoDB Atlas Vector Search can be used to detect anomalies in a stream of financial transactions by analyzing a user's transaction history and identifying suspicious behavior based on LLM-generated embeddings.&lt;/p&gt;

&lt;p&gt;Our solution will monitor MongoDB Change Streams using the Java synchronous driver, triggering vector searches on each new transaction to detect potential fraud. While this approach works well for our demo, and for many use cases, we’ll also discuss its limitations. Throughout the tutorial, I’ll cover alternative strategies to optimize performance, whether you need higher transaction throughput, faster fraud detection, or a more scalable architecture.&lt;/p&gt;

&lt;p&gt;If you want to view the completed code for this tutorial, you can find it on &lt;a href="https://github.com/mongodb-developer/frauddetector" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What we are building
&lt;/h2&gt;

&lt;p&gt;Our real-time fraud detection pipeline will work like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Generate customer profiles: Create synthetic users with predictable spending habits and store them in MongoDB.
&lt;/li&gt;
&lt;li&gt;Produce transactions: A Kafka producer generates transactions, computes AI embeddings, and sends them to Kafka.
&lt;/li&gt;
&lt;li&gt;Consume transactions: A Kafka consumer reads transactions from Kafka and records them in MongoDB.
&lt;/li&gt;
&lt;li&gt;Monitor in real-time: The application listens to MongoDB Change Streams to detect new transactions as they arrive.
&lt;/li&gt;
&lt;li&gt;Run vector search: Each new transaction is compared against the user’s historical transactions using vector search.
&lt;/li&gt;
&lt;li&gt;Detect fraud: A transaction is marked as fraud if:

&lt;ul&gt;
&lt;li&gt;No similar transactions exist for that user.
&lt;/li&gt;
&lt;li&gt;Any of the returned similar transactions are already flagged as fraud.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyt0tu6z0ngo9jtcgv0bu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyt0tu6z0ngo9jtcgv0bu.png" alt="Diagram showing the architecture of the app" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This pipeline ensures real-time anomaly detection using AI-powered embeddings and vector search.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;To follow along, you'll need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A MongoDB Atlas account with a free-forever M0 (or higher) &lt;a href="https://www.mongodb.com/docs/guides/atlas/cluster/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Building+a+Real-Time+AI+Fraud+Detection+System+with+Spring+Kafka+and+MongoDB&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;cluster set up&lt;/a&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://kafka.apache.org/downloads" rel="noopener noreferrer"&gt;Kafka&lt;/a&gt; 3.5+ (local installation or Docker).
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.java.com/download/ie_manual.jsp" rel="noopener noreferrer"&gt;Java 21&lt;/a&gt; set up on your machine.
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://platform.openai.com/docs/overview" rel="noopener noreferrer"&gt;OpenAI API key&lt;/a&gt; (for embeddings).
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://maven.apache.org/download.cgi" rel="noopener noreferrer"&gt;Maven 3.9+&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Create our MongoDB database
&lt;/h2&gt;

&lt;p&gt;Log into MongoDB Atlas and create a free-forever M0 cluster. Create a new database called &lt;code&gt;fraud&lt;/code&gt;. In this database, we need to create two collections.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;customers&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;&lt;code&gt;transactions&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For this simple demo, we are just going to create basic collections because, as of writing this, time series collections do not support search indexes. If you want to learn more about setting these up for your Java application, check out our tutorial &lt;a href="https://www.mongodb.com/developer/products/mongodb/time-series-data-java/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Building+a+Real-Time+AI+Fraud+Detection+System+with+Spring+Kafka+and+MongoDB&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;Handle Time Series Data with MongoDB&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Create a Vector Search index
&lt;/h2&gt;

&lt;p&gt;We are going to create a &lt;a href="https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-search-overview/?tck=docs_server_toc&amp;amp;utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Building+a+Real-Time+AI+Fraud+Detection+System+with+Spring+Kafka+and+MongoDB&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;vector search index&lt;/a&gt; on our &lt;code&gt;transactions&lt;/code&gt; collection. This will allow us to find semantically similar documents, to help us identify anomalies in our customer transactions, and mark these as fraud.&lt;/p&gt;

&lt;p&gt;How does this work? Well, we create &lt;a href="https://www.mongodb.com/resources/basics/vector-embeddings/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Building+a+Real-Time+AI+Fraud+Detection+System+with+Spring+Kafka+and+MongoDB&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;vector embeddings&lt;/a&gt; (numerical representations of our data) of our transactions, to store in our vector store, the transactions collection. We get these by sending the information from our transactions to an embedding model, like OpenAI's &lt;code&gt;text-embedding-3-small&lt;/code&gt;. These embeddings look like an array of floating point numbers, e.g., [2.0457, -3.1417895, ...].&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe8s3g18qpfszyn1c7he5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe8s3g18qpfszyn1c7he5.png" alt="Diagram showing how vector embeddings are created" width="800" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We then use MongoDB's Atlas Vector Search, which uses a nearest neighbour algorithm to decide what transactions are semantically similar. These transactions are returned, and if the new transaction does not match any past transactions for that user, or if it matches transactions marked fraud, it could be an anomaly (potential fraud).&lt;/p&gt;

&lt;p&gt;You can learn the in-depth steps to set up a Vector Search index in &lt;a href="https://www.mongodb.com/docs/atlas/atlas-vector-search/tutorials/vector-search-quick-start/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Building+a+Real-Time+AI+Fraud+Detection+System+with+Spring+Kafka+and+MongoDB&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;Atlas Vector Search Quick Start&lt;/a&gt;, but briefly:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Navigate to MongoDB Atlas → Collections → Click &lt;code&gt;transactions&lt;/code&gt;.
&lt;/li&gt;
&lt;li&gt;Select → Atlas Search → Create Search Index + Vector Search.
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Use the following JSON:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;fields&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;type&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;vector&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;path&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;embedding&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;numDimensions&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1536&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;similarity&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;dotProduct&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Click &lt;code&gt;create index&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We are using 1536 for our number of dimensions because of the embedding model we will be using. OpenAI provides the &lt;code&gt;text-embedding-3-small&lt;/code&gt; for embedding (among others). This is a general purpose model embedding text, but using a model tailored for financial fraud detection, especially a custom model tailored for and trained on your data, will lead to better results. Whether you're looking for faster embeddings or more accurate predictions, you can gain some control of that here.&lt;/p&gt;

&lt;p&gt;Since OpenAI embeddings are optimized for using the dot product algorithm, that is the similarity method we will stick with. It's also worth noting that if fraudulent transactions tend to have stronger signals (e.g., very different merchants, categories, amounts), dot product helps capture that magnitude, along with the direction. Essentially, it naturally emphasises these stronger signals. In contrast, cosine similarity normalizes all vectors to unit length, effectively discarding this useful magnitude information. This means it would treat a small anomaly and a large one as equally “different” in direction, which can dilute the signal strength we care about when identifying fraud. &lt;/p&gt;

&lt;h2&gt;
  
  
  Create a Spring application
&lt;/h2&gt;

&lt;p&gt;We are going to create our application using &lt;a href="https://start.spring.io/" rel="noopener noreferrer"&gt;Spring Initializr&lt;/a&gt;. Here, we need to set our project to Maven, our language to Java, and our Spring Boot version to 3.4.2 (most recent stable release as of writing). We'll also set our app name—mine is &lt;code&gt;frauddetector&lt;/code&gt;—and for packaging, we are going to make a jar, and Java version 21.&lt;/p&gt;

&lt;p&gt;We will need a couple of packages too.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Spring Web
&lt;/li&gt;
&lt;li&gt;Spring for Apache Kafka
&lt;/li&gt;
&lt;li&gt;Spring Data MongoDB
&lt;/li&gt;
&lt;li&gt;OpenAI&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Download and unzip your Spring application. Open the application in the IDE of your choosing. Open the pom.xml file and add the necessary Jackson dependency.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;  
    &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;com.fasterxml.jackson.core&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;  
    &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;jackson-databind&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;  
&lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Setting up configuration
&lt;/h2&gt;

&lt;p&gt;Now, open up the application and go to the &lt;code&gt;application.properties&lt;/code&gt;. Here, we'll add the various connection strings, tokens, and configuration settings for our dependencies.&lt;/p&gt;

&lt;h3&gt;
  
  
  MongoDB configuration
&lt;/h3&gt;

&lt;p&gt;Spring Boot will automatically configure MongoDB, so we don’t have to manually create a &lt;code&gt;MongoClient&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;spring.application.name=frauddetector
spring.data.mongodb.uri=&lt;span class="nt"&gt;&amp;lt;YOUR_CONNECTION_STRING&amp;gt;&lt;/span&gt;
spring.data.mongodb.database=fraud
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Just add your connection string and the database name and we're ready to go.&lt;/p&gt;

&lt;h3&gt;
  
  
  Spring AI configuration
&lt;/h3&gt;

&lt;p&gt;We use Spring AI with OpenAI's API to generate text embeddings for transaction similarity detection. We will use these embeddings for vector search, allowing us to compare transactions based on their semantic meaning and patterns (overall vibe), rather than raw data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;spring.ai.openai.api-key=&lt;span class="nt"&gt;&amp;lt;YOUR_OPEN_AI_API_KEY&amp;gt;&lt;/span&gt;
spring.ai.openai.embedding.options.model=text-embedding-3-small
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;text-embedding-3-small&lt;/code&gt; is a lightweight embedding model optimized for low-latency vector generation. Alternatively, you could use &lt;code&gt;"text-embedding-3-large"&lt;/code&gt; for higher accuracy but at increased cost and latency.&lt;/p&gt;

&lt;p&gt;We will also create a &lt;code&gt;Config&lt;/code&gt; package and add a &lt;code&gt;MongoDBConfig&lt;/code&gt; class.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.config&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.client.MongoClient&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.client.MongoCollection&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.client.MongoDatabase&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.bson.Document&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.context.annotation.Bean&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.context.annotation.Configuration&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="nd"&gt;@Configuration&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MongoDBConfig&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="no"&gt;DATABASE_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"fraud"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="no"&gt;TRANSACTIONS_COLLECTION&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"transactions"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="nd"&gt;@Bean&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;MongoDatabase&lt;/span&gt; &lt;span class="nf"&gt;fraudDatabase&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MongoClient&lt;/span&gt; &lt;span class="n"&gt;mongoClient&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;mongoClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getDatabase&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;DATABASE_NAME&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="nd"&gt;@Bean&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;MongoCollection&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;transactionsCollection&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MongoDatabase&lt;/span&gt; &lt;span class="n"&gt;fraudDatabase&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;fraudDatabase&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getCollection&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;TRANSACTIONS_COLLECTION&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, we will add some methods to connect to our MongoDB database that we can reuse through our application.&lt;/p&gt;

&lt;h3&gt;
  
  
  Kafka configuration
&lt;/h3&gt;

&lt;p&gt;Kafka is a distributed messaging system that allows producers to send messages and consumers to process them asynchronously. In our fraud detection system, we use Kafka to handle real-time transaction ingestion.&lt;/p&gt;

&lt;p&gt;Spring Boot provides built-in Kafka support via &lt;code&gt;spring-kafka&lt;/code&gt;, allowing us to configure Kafka producers and consumers using application properties.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;spring.kafka.bootstrap-servers=localhost:9092
spring.kafka.producer.key-serializer=org.apache.kafka.common.serialization.StringSerializer
spring.kafka.producer.value-serializer=org.springframework.kafka.support.serializer.JsonSerializer
spring.kafka.consumer.bootstrap-servers=localhost:9092
spring.kafka.consumer.group-id=fraud-group
spring.kafka.consumer.auto-offset-reset=earliest
spring.kafka.consumer.key-deserializer=org.apache.kafka.common.serialization.StringDeserializer
spring.kafka.consumer.value-deserializer=org.springframework.kafka.support.serializer.JsonDeserializer
spring.kafka.consumer.properties.spring.json.trusted.packages=com.mongodb.frauddetector.model
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you are unfamiliar with Apache Kafka, this might look like a wall of gibberish, so let's break down what each property does&lt;/p&gt;

&lt;p&gt;Connecting to Kafka&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;spring.kafka.bootstrap-servers=localhost:9092&lt;/code&gt;

&lt;ul&gt;
&lt;li&gt;Specifies the Kafka broker address (default is &lt;code&gt;localhost:9092&lt;/code&gt; if running locally).
&lt;/li&gt;
&lt;li&gt;The producer and consumer will use this to send and receive messages.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;Kafka Producer configuration&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;spring.kafka.producer.key-serializer=org.apache.kafka.common.serialization.StringSerializer&lt;/code&gt;

&lt;ul&gt;
&lt;li&gt;Converts the message key (used for partitioning) into a string.
&lt;/li&gt;
&lt;li&gt;We aren't explicitly setting keys, so Kafka will auto-generate them.
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;
&lt;code&gt;spring.kafka.producer.value-serializer=org.springframework.kafka.support.serializer.JsonSerializer&lt;/code&gt;

&lt;ul&gt;
&lt;li&gt;Serializes Java objects (transactions) into JSON before sending them to Kafka.
&lt;/li&gt;
&lt;li&gt;Required because we send complex objects like &lt;code&gt;Transaction&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;Kafka Consumer configuration&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;spring.kafka.consumer.bootstrap-servers=localhost:9092&lt;/code&gt;

&lt;ul&gt;
&lt;li&gt;Specifies which Kafka broker the consumer should listen to.
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;
&lt;code&gt;spring.kafka.consumer.group-id=fraud-group&lt;/code&gt;

&lt;ul&gt;
&lt;li&gt;Defines a consumer group (&lt;code&gt;fraud-group&lt;/code&gt;).
&lt;/li&gt;
&lt;li&gt;Multiple consumers in this group can share the workload and balance message consumption.
&lt;/li&gt;
&lt;li&gt;If a consumer fails, another instance in the group will resume from where it left off.
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;
&lt;code&gt;spring.kafka.consumer.auto-offset-reset=earliest&lt;/code&gt;

&lt;ul&gt;
&lt;li&gt;Determines where to start consuming messages if no offset exists.
&lt;/li&gt;
&lt;li&gt;Options:
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;earliest&lt;/code&gt;: Start from the beginning of the topic (useful for debugging).
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;latest&lt;/code&gt;: Only read new messages (default behavior).&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;Kafka Consumer deserialization&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;spring.kafka.consumer.key-deserializer=org.apache.kafka.common.serialization.StringDeserializer&lt;/code&gt;

&lt;ul&gt;
&lt;li&gt;Deserializes message keys into Strings.
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;
&lt;code&gt;spring.kafka.consumer.value-deserializer=org.springframework.kafka.support.serializer.JsonDeserializer&lt;/code&gt;

&lt;ul&gt;
&lt;li&gt;Deserializes Kafka messages from JSON into Java objects.
&lt;/li&gt;
&lt;li&gt;This is necessary since our transactions are sent as JSON.
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;
&lt;code&gt;spring.kafka.consumer.properties.spring.json.trusted-packages=com.mongodb.frauddetector.model&lt;/code&gt;

&lt;ul&gt;
&lt;li&gt;Ensures only trusted packages are deserialized.
&lt;/li&gt;
&lt;li&gt;Prevents security vulnerabilities from untrusted deserialization.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;How it all comes together:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Kafka Producer sends transaction events as JSON to the "transactions" topic.
&lt;/li&gt;
&lt;li&gt;Kafka Broker routes these events.
&lt;/li&gt;
&lt;li&gt;Kafka Consumer reads these messages, deserializes them into &lt;code&gt;Transaction&lt;/code&gt; objects, and stores them in MongoDB.
&lt;/li&gt;
&lt;li&gt;MongoDB Change Streams detect new transactions and trigger fraud detection.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Generate our synthetic customer profiles
&lt;/h2&gt;

&lt;p&gt;To make our life simple, we're going to start by adding a few enums. We'll be using these when you're generating our transactions, and defining what is "normal" behaviour for our sample customers. Create a package &lt;code&gt;enums&lt;/code&gt; in the Spring app to store these.&lt;/p&gt;

&lt;p&gt;First, we'll create a &lt;code&gt;Category&lt;/code&gt; enum.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.enums&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;enum&lt;/span&gt; &lt;span class="nc"&gt;Category&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="no"&gt;RETAIL&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;TECH&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;GROCERY&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, a &lt;code&gt;Currency&lt;/code&gt; enum.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.enums&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;enum&lt;/span&gt; &lt;span class="nc"&gt;Currency&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="no"&gt;EUR&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;USD&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;GBP&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And finally, a slightly more exciting &lt;code&gt;Merchant&lt;/code&gt; enum.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.enums&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.List&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Map&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Random&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;enum&lt;/span&gt; &lt;span class="nc"&gt;Merchant&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="c1"&gt;// Retail Merchants  &lt;/span&gt;
    &lt;span class="no"&gt;AMAZON&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;WALMART&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;BEST_BUY&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;TARGET&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;COSTCO&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;ETSY&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;EBAY&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;IKEA&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  

    &lt;span class="c1"&gt;// Tech Merchants  &lt;/span&gt;
    &lt;span class="no"&gt;APPLE&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;MICROSOFT&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;GOOGLE&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  

    &lt;span class="c1"&gt;// Grocery Merchants  &lt;/span&gt;
    &lt;span class="no"&gt;DUNNES_STORES&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;LIDL&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;TESCO&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;Random&lt;/span&gt; &lt;span class="no"&gt;RANDOM&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Random&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Category&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Merchant&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="no"&gt;CATEGORY_MERCHANTS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
            &lt;span class="nc"&gt;Category&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;RETAIL&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;AMAZON&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;WALMART&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;BEST_BUY&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;TARGET&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;COSTCO&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;ETSY&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;EBAY&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;IKEA&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;  
            &lt;span class="nc"&gt;Category&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;TECH&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;APPLE&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;MICROSOFT&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;GOOGLE&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;  
            &lt;span class="nc"&gt;Category&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;GROCERY&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;DUNNES_STORES&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;LIDL&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;TESCO&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="o"&gt;);&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="nc"&gt;Merchant&lt;/span&gt; &lt;span class="nf"&gt;getRandomMerchant&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Category&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Merchant&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;merchants&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;CATEGORY_MERCHANTS&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;merchants&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;RANDOM&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;nextInt&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;merchants&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;size&lt;/span&gt;&lt;span class="o"&gt;()));&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, we have a bit of logic to map our merchants to different categories, and a function to get a random merchant, to help with our transaction generation later (for testing and demo purposes).&lt;/p&gt;

&lt;h3&gt;
  
  
  The customer model
&lt;/h3&gt;

&lt;p&gt;Our customer model is going to outline some information about our sample customers that will help us to define their spending habits. Since we’re building a fraud detection system, we need a way to recognize when a user makes a purchase that doesn't align with their usual behavior. Create a &lt;code&gt;model&lt;/code&gt; package and add a &lt;code&gt;Customer&lt;/code&gt; class.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.model&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.enums.Category&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.enums.Currency&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.enums.Merchant&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.annotation.Id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.mongodb.core.mapping.Document&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.List&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Random&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"customers"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Customer&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="nd"&gt;@Id&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Merchant&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;merchants&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Trusted merchants  &lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Category&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;categories&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Trusted categories  &lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;meanSpending&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;spendingStdDev&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;Currency&lt;/span&gt; &lt;span class="n"&gt;preferredCurrency&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;Customer&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Merchant&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;merchants&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Category&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;categories&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
                    &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;meanSpending&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="n"&gt;spendingStdDev&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Currency&lt;/span&gt; &lt;span class="n"&gt;preferredCurrency&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;userId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;merchants&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;merchants&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;categories&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;categories&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;meanSpending&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;meanSpending&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;spendingStdDev&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;spendingStdDev&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;preferredCurrency&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;preferredCurrency&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getId&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getUserId&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Merchant&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getMerchants&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;merchants&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Category&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getCategories&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;categories&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="nf"&gt;getMeanSpending&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;meanSpending&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt; &lt;span class="nf"&gt;getSpendingStdDev&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;spendingStdDev&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Currency&lt;/span&gt; &lt;span class="nf"&gt;getPreferredCurrency&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;preferredCurrency&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  

&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each customer in our system has:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A unique user ID: This lets us associate transactions with a specific person.
&lt;/li&gt;
&lt;li&gt;Trusted merchants: The places they frequently shop.
&lt;/li&gt;
&lt;li&gt;Trusted categories: The types of purchases they regularly make.
&lt;/li&gt;
&lt;li&gt;Average spending habits: Their typical purchase amount and the standard deviation.
&lt;/li&gt;
&lt;li&gt;Preferred currency: The currency they usually transact in.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This allows us to establish a baseline for their spending behavior. If a new transaction deviates too much, we can flag it as potentially fraudulent. We need to add a couple of helper functions to help us in our data generation. We’ll create the &lt;code&gt;getFrequentCategory()&lt;/code&gt; method, to randomly select one of the user's preferred categories. Open the Customer class and add the following method:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Category&lt;/span&gt; &lt;span class="nf"&gt;getFrequentCategory&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="nc"&gt;Random&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Random&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;categories&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;nextInt&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;categories&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;size&lt;/span&gt;&lt;span class="o"&gt;()));&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We’ll also add the &lt;code&gt;getUnfrequentCategory()&lt;/code&gt; method, to pick a category they don’t usually spend money on—which could be a red flag.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Category&lt;/span&gt; &lt;span class="nf"&gt;getUnfrequentCategory&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="c1"&gt;// Get all categories from the enum  &lt;/span&gt;
        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Category&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;allCategories&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Category&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;values&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;  

        &lt;span class="c1"&gt;// Filter out frequent categories  &lt;/span&gt;
        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Category&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;infrequentCategories&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;allCategories&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;filter&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;category&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;categories&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;contains&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; 
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toList&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  

        &lt;span class="c1"&gt;// Pick a random category from the remaining ones  &lt;/span&gt;
        &lt;span class="nc"&gt;Random&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Random&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;infrequentCategories&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;nextInt&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;infrequentCategories&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;size&lt;/span&gt;&lt;span class="o"&gt;()));&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If a user who only shops at grocery stores suddenly makes a high-ticket tech purchase, we might need to flag that as fraud.&lt;/p&gt;

&lt;p&gt;Currency changes can be another strong indicator of fraud. If a user always transacts in USD, but suddenly makes a purchase in EUR, that could signal fraudulent activity.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Currency&lt;/span&gt; &lt;span class="nf"&gt;getRandomSuspiciousCurrency&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="c1"&gt;// Get all categories from the enum  &lt;/span&gt;
        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Currency&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;allCurrency&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Currency&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;values&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;  

        &lt;span class="c1"&gt;// Filter out frequent categories  &lt;/span&gt;
        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Currency&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;infrequentCurrency&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;allCurrency&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;filter&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;currency&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;!(&lt;/span&gt;&lt;span class="n"&gt;preferredCurrency&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;currency&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; 
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toList&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  

        &lt;span class="c1"&gt;// Pick a random category from the remaining ones  &lt;/span&gt;
        &lt;span class="nc"&gt;Random&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Random&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;infrequentCurrency&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;nextInt&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;infrequentCurrency&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;size&lt;/span&gt;&lt;span class="o"&gt;()));&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This method helps simulate a realistic fraud scenario, where someone might steal a card and use it overseas.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Customer seeding&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Before we can generate transactions, we need some customers in our database. These aren't just random users—we're giving them specific spending habits, preferred merchants, and common transaction patterns. This will help us create realistic data for fraud detection. Create a &lt;code&gt;service&lt;/code&gt; package and add a &lt;code&gt;CustomerSeeder&lt;/code&gt; class. Copy the following code into it, then we'll break down what is happening.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.enums.Category&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.enums.Currency&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.enums.Merchant&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.model.Customer&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;jakarta.annotation.PostConstruct&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.slf4j.Logger&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.slf4j.LoggerFactory&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.mongodb.core.MongoTemplate&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.List&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@Service&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CustomerSeeder&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;Logger&lt;/span&gt; &lt;span class="n"&gt;logger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LoggerFactory&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getLogger&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;CustomerSeeder&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;MongoTemplate&lt;/span&gt; &lt;span class="n"&gt;mongoTemplate&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;CustomerSeeder&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MongoTemplate&lt;/span&gt; &lt;span class="n"&gt;mongoTemplate&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;mongoTemplate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mongoTemplate&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="nd"&gt;@PostConstruct&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;seedCustomers&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mongoTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getCollection&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"customers"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;countDocuments&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;info&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Customers already exist. Skipping seed."&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
            &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="o"&gt;}&lt;/span&gt;  

        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Customer&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
                &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Customer&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"user_1"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Merchant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;AMAZON&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Merchant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;BEST_BUY&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Category&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;TECH&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Category&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;RETAIL&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt; &lt;span class="mf"&gt;150.0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;30.0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Currency&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;USD&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;  
                &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Customer&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"user_2"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Merchant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;WALMART&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Merchant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;TARGET&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Merchant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;DUNNES_STORES&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Category&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;RETAIL&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Category&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;GROCERY&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt; &lt;span class="mf"&gt;80.0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;20.0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Currency&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;USD&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;  
                &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Customer&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"user_3"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Merchant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;APPLE&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Merchant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;MICROSOFT&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Category&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;TECH&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt; &lt;span class="mf"&gt;250.0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;50.0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Currency&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;EUR&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;  
                &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Customer&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"user_4"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Merchant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ETSY&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Merchant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;BEST_BUY&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Category&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;RETAIL&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt; &lt;span class="mf"&gt;100.0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;25.0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Currency&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;EUR&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;  
                &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Customer&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"user_5"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Merchant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ETSY&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Merchant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;EBAY&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Category&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;RETAIL&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt; &lt;span class="mf"&gt;90.0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;20.0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Currency&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;GBP&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;  
                &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Customer&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"user_6"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Merchant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;TESCO&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Merchant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;DUNNES_STORES&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Category&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;GROCERY&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt; &lt;span class="mf"&gt;40.0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;10.0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Currency&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;EUR&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;  
                &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Customer&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"user_7"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Merchant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;LIDL&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Merchant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;COSTCO&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Category&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;GROCERY&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt; &lt;span class="mf"&gt;35.0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;8.0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Currency&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;EUR&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;  
                &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Customer&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"user_8"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Merchant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;GOOGLE&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Merchant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;MICROSOFT&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Category&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;TECH&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt; &lt;span class="mf"&gt;15.0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;5.0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Currency&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;USD&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;  
                &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Customer&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"user_9"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Merchant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;EBAY&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Merchant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ETSY&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Category&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;RETAIL&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt; &lt;span class="mf"&gt;60.0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;15.0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Currency&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;GBP&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;  
                &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Customer&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"user_10"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Merchant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;COSTCO&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Merchant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;IKEA&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Category&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;RETAIL&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt; &lt;span class="mf"&gt;25.0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;7.0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Currency&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;GBP&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
        &lt;span class="o"&gt;);&lt;/span&gt;  

        &lt;span class="n"&gt;mongoTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;insertAll&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;info&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Customers seeded successfully!"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, let’s walk through the implementation.&lt;/p&gt;

&lt;p&gt;You might be wondering why we’re not using &lt;code&gt;MongoRepository&lt;/code&gt; for this. While &lt;code&gt;MongoRepository&lt;/code&gt; is great for simple CRUD operations, it doesn't provide full control over how we seed our database.&lt;/p&gt;

&lt;p&gt;Using MongoTemplate, we can directly check if data exists before inserting (so we don’t create duplicates).&lt;/p&gt;

&lt;p&gt;We use Spring’s &lt;code&gt;@PostConstruct&lt;/code&gt; annotation to run the &lt;code&gt;seedCustomers()&lt;/code&gt; method automatically when the application starts. This ensures our database is populated before transactions start flowing.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@PostConstruct&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;seedCustomers&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;If customer data already exists in the MongoDB collection, we skip seeding.
&lt;/li&gt;
&lt;li&gt;Otherwise, we generate 10 customers, each with:

&lt;ul&gt;
&lt;li&gt;A user ID.
&lt;/li&gt;
&lt;li&gt;A list of trusted merchants.
&lt;/li&gt;
&lt;li&gt;A list of frequent categories.
&lt;/li&gt;
&lt;li&gt;A mean spending amount and standard deviation.
&lt;/li&gt;
&lt;li&gt;A preferred currency.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;Once the data is created, we use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="n"&gt;mongoTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;insertAll&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Thanks to the magic of Spring and our &lt;code&gt;@PostContruct&lt;/code&gt; annotation, mongoTemplate will bulk insert everything into our MongoDB database just by running the application.&lt;/p&gt;

&lt;h3&gt;
  
  
  How the sample customers are structured
&lt;/h3&gt;

&lt;p&gt;Each customer is designed to behave predictably. For the sake of this demonstration, very predictably:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tech shoppers: Spend more on Amazon, Best Buy, and Apple.
&lt;/li&gt;
&lt;li&gt;Grocery shoppers: Frequent Tesco, Dunnes Stores, or Lidl.
&lt;/li&gt;
&lt;li&gt;General retail shoppers: Shop at Etsy, eBay, or Walmart.
&lt;/li&gt;
&lt;li&gt;Spending amounts vary but follow a general pattern.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With this customer seeding in place:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We have a structured dataset to test fraud detection.
&lt;/li&gt;
&lt;li&gt;Customers behave in predictable ways, making anomalies easier to catch.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once the customers are in MongoDB, we can move on to generating transactions.&lt;/p&gt;

&lt;h3&gt;
  
  
  The transaction model
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;Transaction&lt;/code&gt; model represents individual financial transactions in our system. These transactions are generated based on our synthetic customer profiles and are stored in MongoDB. This model plays a key role in detecting anomalies by allowing us to compare each new transaction against a user’s historical behavior.&lt;/p&gt;

&lt;p&gt;Create a &lt;code&gt;Transaction&lt;/code&gt; class inside the &lt;code&gt;model&lt;/code&gt; package:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.model&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.fasterxml.jackson.annotation.JsonIgnoreProperties&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.fasterxml.jackson.annotation.JsonProperty&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.enums.Category&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.enums.Currency&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.enums.Merchant&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.bson.BsonDocument&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.bson.codecs.configuration.CodecRegistry&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.bson.conversions.Bson&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.annotation.Id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.mongodb.core.mapping.Document&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.time.Instant&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Random&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.UUID&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;static&lt;/span&gt; &lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;mongodb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;frauddetector&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;enums&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Merchant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getRandomMerchant&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="nd"&gt;@JsonIgnoreProperties&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ignoreUnknown&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="nd"&gt;@Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"transactions"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Transaction&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nd"&gt;@Id&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="nd"&gt;@JsonProperty&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"transaction_id"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;transactionId&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Currency&lt;/span&gt; &lt;span class="n"&gt;currency&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Instant&lt;/span&gt; &lt;span class="n"&gt;timestamp&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Merchant&lt;/span&gt; &lt;span class="n"&gt;merchant&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Category&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="n"&gt;isFraud&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;float&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;{};&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;Transaction&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getId&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setId&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getTransactionId&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;transactionId&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setTransactionId&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;transactionId&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;transactionId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;transactionId&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getUserId&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setUserId&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;userId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="nf"&gt;getAmount&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setAmount&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;amount&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Currency&lt;/span&gt; &lt;span class="nf"&gt;getCurrency&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;currency&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setCurrency&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Currency&lt;/span&gt; &lt;span class="n"&gt;currency&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;currency&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;currency&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Instant&lt;/span&gt; &lt;span class="nf"&gt;getTimestamp&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;timestamp&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setTimestamp&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Instant&lt;/span&gt; &lt;span class="n"&gt;timestamp&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;timestamp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;timestamp&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Merchant&lt;/span&gt; &lt;span class="nf"&gt;getMerchant&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;merchant&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setMerchant&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Merchant&lt;/span&gt; &lt;span class="n"&gt;merchant&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;merchant&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;merchant&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Category&lt;/span&gt; &lt;span class="nf"&gt;getCategory&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setCategory&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Category&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;category&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="nf"&gt;isFraud&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;isFraud&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setFraud&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="n"&gt;fraud&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;isFraud&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fraud&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;float&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="nf"&gt;getEmbedding&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;setEmbedding&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;float&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="o"&gt;}&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each transaction has:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A unique transaction ID: Used to track and reference individual transactions.
&lt;/li&gt;
&lt;li&gt;A user ID: Links the transaction to a specific customer.
&lt;/li&gt;
&lt;li&gt;Amount: The total cost of the transaction.
&lt;/li&gt;
&lt;li&gt;Currency: The currency in which the transaction was made.
&lt;/li&gt;
&lt;li&gt;Timestamp: The exact time the transaction took place.
&lt;/li&gt;
&lt;li&gt;Merchant: The vendor where the transaction occurred.
&lt;/li&gt;
&lt;li&gt;Category: The type of purchase made.
&lt;/li&gt;
&lt;li&gt;Fraud flag: A boolean value indicating whether the transaction is marked as fraudulent.
&lt;/li&gt;
&lt;li&gt;Embedding: A numerical representation of the transaction generated via AI embeddings, which we’ll use for vector search.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At the top of our class, we specify which collection we want our financial transactions stored in, &lt;code&gt;@Document(collection = "transactions")&lt;/code&gt;. Spring Data MongoDB handles the BSON serialization for us.&lt;/p&gt;

&lt;p&gt;To detect fraudulent transactions, we’ll use vector search to compare transactions based on their overall context rather than exact matching. The &lt;code&gt;generateEmbeddingText()&lt;/code&gt; method creates a string representation of the transaction, which we’ll later convert into an embedding.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;generateEmbeddingText&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;" "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;" "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;currency&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;" "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;merchant&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;" "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Generating random transactions
&lt;/h3&gt;

&lt;p&gt;We need to simulate realistic transaction data for our customers. The &lt;code&gt;generateRandom()&lt;/code&gt; method creates our synthetic transactions that either align with or deviate from a user's normal spending patterns.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="nc"&gt;Transaction&lt;/span&gt; &lt;span class="nf"&gt;generateRandom&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Customer&lt;/span&gt; &lt;span class="n"&gt;customer&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;Random&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Random&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

    &lt;span class="c1"&gt;// Generate a normal or suspicious transaction&lt;/span&gt;
    &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="n"&gt;isSuspicious&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;nextDouble&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// 10% chance of fraud&lt;/span&gt;

    &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;isSuspicious&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="n"&gt;customer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getMeanSpending&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;nextDouble&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="c1"&gt;// Unusually large&lt;/span&gt;
            &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;customer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getMeanSpending&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;nextDouble&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;

    &lt;span class="nc"&gt;Category&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;isSuspicious&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="n"&gt;customer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getUnfrequentCategory&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;customer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getFrequentCategory&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

    &lt;span class="nc"&gt;Merchant&lt;/span&gt; &lt;span class="n"&gt;merchant&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;getRandomMerchant&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

    &lt;span class="nc"&gt;Currency&lt;/span&gt; &lt;span class="n"&gt;currency&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;isSuspicious&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="n"&gt;customer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getRandomSuspiciousCurrency&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
            &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;customer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getPreferredCurrency&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Transaction&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
            &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="no"&gt;UUID&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;randomUUID&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;toString&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt;
            &lt;span class="n"&gt;customer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getUserId&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt;
            &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;currency&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nc"&gt;Instant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;now&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt;
            &lt;span class="n"&gt;merchant&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="kc"&gt;false&lt;/span&gt;
    &lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This method randomly decides whether the transaction is suspicious (10% chance). Feel free to adjust this to suit your use case you want to test for. It then sets the amount to either a normal range or an unusually high value. Next, we choose a category and merchant based on the user’s normal spending behavior (or select a category they don’t usually shop in if the transaction is suspicious). Lastly, we select a currency, potentially choosing an unusual one for suspicious transactions.&lt;/p&gt;

&lt;p&gt;This &lt;code&gt;Transaction&lt;/code&gt; model is the backbone of our fraud detection pipeline. Once generated, transactions will be sent to Kafka, processed, and stored in MongoDB, where we’ll use MongoDB Change Streams and Vector Search to flag potential fraud.&lt;/p&gt;

&lt;h3&gt;
  
  
  How do we create our AI embeddings?
&lt;/h3&gt;

&lt;p&gt;For this example, we are going to use the OpenAI API embedding model for no other reason than it's simple. It is general purpose so it will work, and Spring AI makes it very straightforward to set up. Now, I highly recommend looking at a tailored embedding model for fraud detection for your application. If you want to learn how to do this, check out my tutorial on &lt;a href="https://www.mongodb.com/developer/languages/java/build-fraud-detection-model-java-deeplearning4j/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Building+a+Real-Time+AI+Fraud+Detection+System+with+Spring+Kafka+and+MongoDB&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;how to train your own fraud detection model in Java&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;With our API key and embedding model configured in the &lt;code&gt;application.properties&lt;/code&gt; from earlier, we need to create a &lt;code&gt;config&lt;/code&gt; package, and add a new &lt;code&gt;OpenAIConfig&lt;/code&gt; class.&lt;/p&gt;

&lt;p&gt;Here, we just need to create a configuration class that takes in the API key, and sets up the OpenAI embedding model using Spring AI.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.config&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.ai.embedding.EmbeddingModel&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.ai.openai.OpenAiEmbeddingModel&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.ai.openai.api.OpenAiApi&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.beans.factory.annotation.Value&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.context.annotation.Bean&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.context.annotation.Configuration&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@Configuration&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;OpenAIConfig&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="nd"&gt;@Value&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"${spring.ai.openai.api-key}"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;apiKey&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="nd"&gt;@Bean&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;EmbeddingModel&lt;/span&gt; &lt;span class="nf"&gt;embeddingModel&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;OpenAiEmbeddingModel&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenAiApi&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;apiKey&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, our embedding model is ready to be called upon for generating our vectors to store alongside our transactions in MongoDB.&lt;/p&gt;

&lt;p&gt;With our embedding model set up, in our &lt;code&gt;Service&lt;/code&gt; package, we can add our new class, &lt;code&gt;EmbeddingGenerator&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.ai.embedding.EmbeddingModel&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Component&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="nd"&gt;@Component&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EmbeddingGenerator&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;EmbeddingModel&lt;/span&gt; &lt;span class="n"&gt;embeddingModel&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;EmbeddingGenerator&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;EmbeddingModel&lt;/span&gt; &lt;span class="n"&gt;embeddingModel&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;embeddingModel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embeddingModel&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;float&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="nf"&gt;getEmbedding&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;transaction&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;float&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;embeddingResponse&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;embeddingModel&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;embed&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transaction&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;embeddingResponse&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will take the information we want to include in our embedding for each transaction, and generate this embedding using our embedding model.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;@Component&lt;/code&gt; notation allows Spring to detect our custom beans automatically. Basically, it lets us instantiate a class and inject any specified dependencies into it, and inject them wherever needed.&lt;/p&gt;

&lt;h3&gt;
  
  
  How to simply interact with our database with Spring Data MongoDB
&lt;/h3&gt;

&lt;p&gt;For our simple CRUD operations for our customers and transactions, finds, and inserts mostly, we can use &lt;code&gt;MongoRepository&lt;/code&gt; to get this done nice and simple. Create a package &lt;code&gt;repository&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;First, we'll create a &lt;code&gt;CustomerRepository&lt;/code&gt; interface.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.repository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.model.Customer&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.mongodb.repository.MongoRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;interface&lt;/span&gt; &lt;span class="nc"&gt;CustomerRepository&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;MongoRepository&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Customer&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, we'll create a &lt;code&gt;TransactionRepository&lt;/code&gt; interface.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.repository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.model.Transaction&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.data.mongodb.repository.MongoRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;interface&lt;/span&gt; &lt;span class="nc"&gt;TransactionRepository&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;MongoRepository&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Transaction&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We don't need to do anything else. &lt;code&gt;MongoRepository&lt;/code&gt; provides us with all the CRUD operations we need. We don't need to create any custom query implementations. If you want to see more about what &lt;code&gt;MongoRepository&lt;/code&gt; provides, and the differences between &lt;code&gt;MongoRepository&lt;/code&gt; and &lt;code&gt;MongoTemplate&lt;/code&gt;, check out our article about &lt;a href="https://www.mongodb.com/developer/products/mongodb/springdata-getting-started-with-java-mongodb/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Building+a+Real-Time+AI+Fraud+Detection+System+with+Spring+Kafka+and+MongoDB&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;getting started with Spring Data MongoDB&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Seed the transactions for our customers
&lt;/h3&gt;

&lt;p&gt;Now that we have customers in our database, we need to generate some transactions for them. These transactions will serve as our baseline data, allowing us to compare future transactions against historical spending patterns.&lt;/p&gt;

&lt;p&gt;Create a &lt;code&gt;TransactionSeeder&lt;/code&gt; class inside the &lt;code&gt;service&lt;/code&gt; package and copy the following code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.client.MongoCollection&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.model.Customer&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.model.Transaction&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.repository.CustomerRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.repository.TransactionRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;jakarta.annotation.PostConstruct&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.bson.Document&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.slf4j.Logger&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.slf4j.LoggerFactory&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.ArrayList&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.List&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="nd"&gt;@Service&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TransactionSeeder&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;Logger&lt;/span&gt; &lt;span class="n"&gt;logger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LoggerFactory&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getLogger&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;TransactionSeeder&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;CustomerRepository&lt;/span&gt; &lt;span class="n"&gt;customerRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;EmbeddingGenerator&lt;/span&gt; &lt;span class="n"&gt;embeddingGenerator&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;TransactionRepository&lt;/span&gt; &lt;span class="n"&gt;transactionRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;TransactionChangeStreamListener&lt;/span&gt; &lt;span class="n"&gt;transactionChangeStreamListener&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;MongoCollection&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;transactionsCollection&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;TransactionSeeder&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;CustomerRepository&lt;/span&gt; &lt;span class="n"&gt;customerRepository&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                             &lt;span class="nc"&gt;EmbeddingGenerator&lt;/span&gt; &lt;span class="n"&gt;embeddingGenerator&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;TransactionRepository&lt;/span&gt; &lt;span class="n"&gt;transactionRepository&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                             &lt;span class="nc"&gt;TransactionChangeStreamListener&lt;/span&gt; &lt;span class="n"&gt;transactionChangeStreamListener&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;MongoCollection&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;transactionsCollection&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;transactionsCollection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;transactionsCollection&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;transactionRepository&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;transactionRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;customerRepository&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;customerRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;embeddingGenerator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embeddingGenerator&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;transactionChangeStreamListener&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;transactionChangeStreamListener&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This service is responsible for generating sample transactions for each customer and creating the embeddings, then storing them in MongoDB.&lt;/p&gt;

&lt;p&gt;It is also responsible for starting the Change Stream listener to detect real-time transactions after the database has been seeded. We make sure to do this after the database has been seeded. We are again accessing the MongoDB Java driver directly, to allow us to verify if the collection has already been seeded.&lt;/p&gt;

&lt;p&gt;Next, we need a post construct method, &lt;code&gt;seedTransactions()&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;   &lt;span class="nd"&gt;@PostConstruct&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;seedTransactions&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transactionsCollection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;countDocuments&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;info&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Transactions already seeded."&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;

        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Customer&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;customerRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findAll&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Transaction&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;transactions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ArrayList&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;();&lt;/span&gt;

        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Customer&lt;/span&gt; &lt;span class="n"&gt;customer&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="nc"&gt;Transaction&lt;/span&gt; &lt;span class="n"&gt;transaction&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Transaction&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;generateRandom&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;customer&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
                &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;embeddingText&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;transaction&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;generateEmbeddingText&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
                &lt;span class="kt"&gt;float&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embeddingGenerator&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getEmbedding&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embeddingText&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
                &lt;span class="n"&gt;transaction&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setEmbedding&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
                &lt;span class="n"&gt;transactions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;add&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transaction&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="o"&gt;}&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;

        &lt;span class="n"&gt;transactionRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;saveAll&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transactions&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;info&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Seeded 100 transactions."&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="n"&gt;transactionChangeStreamListener&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;startListening&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;info&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Change Stream Listener Started."&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the service is initialized, the &lt;code&gt;@PostConstruct&lt;/code&gt; method runs, ensuring that our transactions are created before the system starts processing new ones. We check if the transactions collection already contains data. If it does, we skip the seeding process. Otherwise, we generate 10 transactions per customer.&lt;/p&gt;

&lt;p&gt;We fetch all customers from the database. Each customer gets a fixed amount of 10 historical transactions, and each transaction is randomly generated based on the customer's typical spending habits.&lt;/p&gt;

&lt;p&gt;Finally, we batch insert our transactions, and we start listening for real-time transactions so that new ones can be evaluated for fraud detection.&lt;/p&gt;

&lt;h3&gt;
  
  
  How we'll send our transactions with a Kafka Producer
&lt;/h3&gt;

&lt;p&gt;Now that we've seeded our customer data and generated initial transactions, we need a way to continuously produce transactions in real time. We'll be using a Kafka Producer for this. The producer will generate synthetic transactions, compute their embeddings, and send them to Kafka for further processing.&lt;/p&gt;

&lt;p&gt;Kafka follows a publish-subscribe model:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Producers send messages to a Kafka topic (in this case, &lt;code&gt;"transactions"&lt;/code&gt;).
&lt;/li&gt;
&lt;li&gt;Consumers subscribe to that topic and process incoming messages.
&lt;/li&gt;
&lt;li&gt;Topics store and distribute messages to consumers, ensuring scalability.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Create a &lt;code&gt;TransactionProducer&lt;/code&gt; class inside the &lt;code&gt;service&lt;/code&gt; package and copy the following code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.model.Customer&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.model.Transaction&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.repository.CustomerRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;jakarta.annotation.PostConstruct&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.slf4j.Logger&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.slf4j.LoggerFactory&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.kafka.core.KafkaTemplate&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.scheduling.annotation.Scheduled&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.List&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Random&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@Service&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TransactionProducer&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;Logger&lt;/span&gt; &lt;span class="n"&gt;logger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LoggerFactory&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getLogger&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;TransactionProducer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="no"&gt;TOPIC&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"transactions"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;EmbeddingGenerator&lt;/span&gt; &lt;span class="n"&gt;embeddingGenerator&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;KafkaTemplate&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Transaction&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;kafkaTemplate&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Customer&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;Random&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Random&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;CustomerRepository&lt;/span&gt; &lt;span class="n"&gt;customerRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;TransactionProducer&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;KafkaTemplate&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Transaction&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;kafkaTemplate&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;EmbeddingGenerator&lt;/span&gt; &lt;span class="n"&gt;embeddingGenerator&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;CustomerRepository&lt;/span&gt; &lt;span class="n"&gt;customerRepository&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;kafkaTemplate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;kafkaTemplate&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;embeddingGenerator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embeddingGenerator&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;customerRepository&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;customerRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Spring's Kafka template is used to send messages to a Kafka topic. Next, we need a method to fetch customer data to create transactions. Create a &lt;code&gt;@PostConstruct&lt;/code&gt; method.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;    &lt;span class="nd"&gt;@PostConstruct&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;loadCustomers&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;customerRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findAll&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isEmpty&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Warning: No customers found! Transactions may fail."&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;info&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Cached {} customers for transaction generation."&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;size&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;  
        &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We cache customer data in memory when the application starts. This avoids repeated database lookups and speeds up transaction generation. Now, we need a method to generate our synthetic transactions. We will use the &lt;code&gt;@Scheduled&lt;/code&gt; annotation to create a &lt;code&gt;generateAndSendTransaction()&lt;/code&gt; method to run every 100ms.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;    &lt;span class="nd"&gt;@Scheduled&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fixedRate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;generateAndSendTransaction&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isEmpty&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"No customers available. Skipping transaction generation."&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
            &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
        &lt;span class="o"&gt;}&lt;/span&gt;  
        &lt;span class="nc"&gt;Transaction&lt;/span&gt; &lt;span class="n"&gt;transaction&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Transaction&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;generateRandom&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;nextInt&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;size&lt;/span&gt;&lt;span class="o"&gt;())));&lt;/span&gt;  
        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;embeddingText&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;transaction&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;generateEmbeddingText&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
        &lt;span class="n"&gt;transaction&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setEmbedding&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embeddingGenerator&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getEmbedding&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embeddingText&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;  
        &lt;span class="n"&gt;kafkaTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;send&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;TOPIC&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;transaction&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getTransactionId&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;transaction&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;info&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Transaction sent to topic {}"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;TOPIC&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This randomly selects a customer, creates a transaction, and generates an embedding. It then sends the transaction to Kafka.&lt;/p&gt;

&lt;p&gt;We also need to add &lt;code&gt;@EnableScheduling&lt;/code&gt; to our &lt;code&gt;FrauddetectorApplication&lt;/code&gt; class to allow us to schedule these transactions for our demo.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.boot.SpringApplication&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.boot.autoconfigure.SpringBootApplication&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.scheduling.annotation.EnableScheduling&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@SpringBootApplication&lt;/span&gt;  
&lt;span class="nd"&gt;@EnableScheduling&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;FrauddetectorApplication&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
       &lt;span class="nc"&gt;SpringApplication&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;FrauddetectorApplication&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now that our Kafka Producer is generating transactions and sending them to a Kafka topic, we need a way to consume them and store them in MongoDB. This is where the Kafka Consumer comes in.&lt;/p&gt;

&lt;h2&gt;
  
  
  Ingest our transactions
&lt;/h2&gt;

&lt;p&gt;This step is fairly simple using &lt;code&gt;spring-kafka&lt;/code&gt;. In the &lt;code&gt;service&lt;/code&gt; package, add a &lt;code&gt;TransactionConsumer&lt;/code&gt; class. We'll set up a simple Kafka listener to automatically process incoming transaction messages.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.model.Transaction&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.repository.TransactionRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.kafka.annotation.KafkaListener&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@Service&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TransactionConsumer&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;TransactionRepository&lt;/span&gt; &lt;span class="n"&gt;transactionRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;TransactionConsumer&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;TransactionRepository&lt;/span&gt; &lt;span class="n"&gt;transactionRepository&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;transactionRepository&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;transactionRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="nd"&gt;@KafkaListener&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;topics&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"transactions"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;groupId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"fraud-group"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;consumeTransaction&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Transaction&lt;/span&gt; &lt;span class="n"&gt;transaction&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="n"&gt;transactionRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;save&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transaction&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  How Kafka Consumers work
&lt;/h3&gt;

&lt;p&gt;Kafka follows a publish-subscribe model:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Producers publish messages to a topic (in this case, &lt;code&gt;"transactions"&lt;/code&gt;).
&lt;/li&gt;
&lt;li&gt;Consumers subscribe to that topic and process incoming messages.
&lt;/li&gt;
&lt;li&gt;Consumers can be grouped into consumer groups, where each instance reads a portion of the messages.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For our system:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Our Kafka Producer generates transactions and publishes them to Kafka.
&lt;/li&gt;
&lt;li&gt;Our Kafka Consumer listens to the &lt;code&gt;"transactions"&lt;/code&gt; topic.
&lt;/li&gt;
&lt;li&gt;Each transaction is stored in MongoDB via &lt;code&gt;TransactionRepository&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With our transactions being produced and consumed, we can now look at how we will monitor incoming transactions on our MongoDB database.&lt;/p&gt;

&lt;h2&gt;
  
  
  Monitor our database with Change Streams
&lt;/h2&gt;

&lt;p&gt;Once transactions are stored in MongoDB, we need a way to detect new transactions in real time and trigger fraud detection immediately. Instead of repeatedly querying the database for new data, we can use &lt;a href="https://www.mongodb.com/docs/manual/changeStreams/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Building+a+Real-Time+AI+Fraud+Detection+System+with+Spring+Kafka+and+MongoDB&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;MongoDB Change Streams&lt;/a&gt; to efficiently listen for changes.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is a change stream?
&lt;/h3&gt;

&lt;p&gt;MongoDB Change Streams provide a real-time feed that allows applications to listen for inserts, updates, deletes, and replacements happening in a collection or database. It works similarly to an event listener, triggering an action whenever a change occurs.&lt;/p&gt;

&lt;p&gt;Instead of running queries to check for new transactions, Change Streams push updates to our application as soon as they occur, making fraud detection more efficient and reducing unnecessary database calls.&lt;/p&gt;

&lt;p&gt;So why use Change Streams?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Real-time monitoring: Detects fraud immediately after a transaction is inserted.
&lt;/li&gt;
&lt;li&gt;Push-based updates: Eliminates the need for polling or manual querying.
&lt;/li&gt;
&lt;li&gt;Optimized performance: Runs efficiently, even for high-velocity transaction data.
&lt;/li&gt;
&lt;li&gt;Built-in scalability: Works across sharded clusters and replica sets.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To monitor transactions, we create a &lt;code&gt;TransactionChangeStreamListener&lt;/code&gt; service in our &lt;code&gt;service&lt;/code&gt; package.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.client.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.client.model.Aggregates&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.client.model.Filters&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.client.model.changestream.ChangeStreamDocument&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.bson.Document&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.bson.conversions.Bson&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.slf4j.Logger&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.slf4j.LoggerFactory&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.List&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.concurrent.Executors&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.concurrent.ExecutorService&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="nd"&gt;@Service&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TransactionChangeStreamListener&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;Logger&lt;/span&gt; &lt;span class="n"&gt;logger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LoggerFactory&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getLogger&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;TransactionChangeStreamListener&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

   &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;TransactionVectorSearchService&lt;/span&gt; &lt;span class="n"&gt;vectorSearchService&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;ExecutorService&lt;/span&gt; &lt;span class="n"&gt;executorService&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Executors&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;newSingleThreadExecutor&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt; &lt;span class="c1"&gt;// Keeps it synchronous&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;MongoCollection&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;transactionsCollection&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;TransactionChangeStreamListener&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;TransactionVectorSearchService&lt;/span&gt; &lt;span class="n"&gt;vectorSearchService&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;MongoCollection&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;transactionsCollection&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;transactionsCollection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;transactionsCollection&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;vectorSearchService&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vectorSearchService&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We need to create a method &lt;code&gt;startListening()&lt;/code&gt; to begin monitoring our database.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;startListening&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;executorService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;submit&lt;/span&gt;&lt;span class="o"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;// Filter to only listen for INSERT operations&lt;/span&gt;
            &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Bson&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;pipeline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Aggregates&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;match&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Filters&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;eq&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"operationType"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"insert"&lt;/span&gt;&lt;span class="o"&gt;)));&lt;/span&gt;

            &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MongoCursor&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;ChangeStreamDocument&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;cursor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;transactionsCollection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;watch&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pipeline&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;iterator&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;hasNext&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                    &lt;span class="nc"&gt;ChangeStreamDocument&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;change&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;next&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
                    &lt;span class="nc"&gt;Document&lt;/span&gt; &lt;span class="n"&gt;transactionDoc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;change&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getFullDocument&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transactionDoc&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;info&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"New transaction detected: {}"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;transactionDoc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getString&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"transactionId"&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;

                        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Double&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;transactionDoc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getList&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"embedding"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
                        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                            &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;info&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Performing vector search"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
                            &lt;span class="n"&gt;vectorSearchService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;evaluateTransactionFraud&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transactionDoc&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
                        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                            &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Warning: Transaction does not contain an embedding field."&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
                        &lt;span class="o"&gt;}&lt;/span&gt;
                    &lt;span class="o"&gt;}&lt;/span&gt;
                &lt;span class="o"&gt;}&lt;/span&gt;
            &lt;span class="o"&gt;}&lt;/span&gt;
        &lt;span class="o"&gt;});&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, the change stream monitors all operations on our collection. We don't care about the updates or deletes for our use case, so we use the aggregation pipeline to filter for only inserts.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Bson&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;pipeline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Aggregates&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;match&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Filters&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;eq&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"operationType"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"insert"&lt;/span&gt;&lt;span class="o"&gt;)));&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With each incoming transaction, we extract the embedding and use it to run a vector search (we are going to implement this next). If this returns similar transactions marked fraud, we are going to mark this new transaction as fraud. If we don't find any similar transactions for that customer, we are also going to mark it as fraud. We will go more into the implementation of this logic in the next section.&lt;/p&gt;

&lt;h2&gt;
  
  
  Fraud detection with vector search
&lt;/h2&gt;

&lt;p&gt;Fraud detection in this system relies on MongoDB Atlas Vector Search, which allows us to compare transaction embeddings against historical transactions to determine if a new transaction is suspicious.  Create a class &lt;code&gt;TransactionVectorSearchService&lt;/code&gt; in the &lt;code&gt;service&lt;/code&gt; package.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.frauddetector.service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.client.MongoCollection&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.client.model.Aggregates&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.client.model.Filters&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.client.model.Updates&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.client.model.search.VectorSearchOptions&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.bson.Document&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.bson.conversions.Bson&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.slf4j.Logger&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.slf4j.LoggerFactory&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.ArrayList&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.List&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Arrays&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;static&lt;/span&gt; &lt;span class="n"&gt;com&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;mongodb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;search&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;SearchPath&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fieldPath&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="nd"&gt;@Service&lt;/span&gt;  
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TransactionVectorSearchService&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;Logger&lt;/span&gt; &lt;span class="n"&gt;logger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LoggerFactory&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getLogger&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;TransactionVectorSearchService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;MongoCollection&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;transactionCollection&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="no"&gt;VECTOR_INDEX_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"vector_index"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Ensure this matches your Atlas index name  &lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="no"&gt;SEARCH_LIMIT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Number of similar transactions to retrieve  &lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="no"&gt;NUM_CANDIDATES&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Number of approximate neighbors to consider  &lt;/span&gt;

   &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;TransactionVectorSearchService&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MongoCollection&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;transactionCollection&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;transactionCollection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;transactionCollection&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;TransactionVectorSearchService&lt;/code&gt; is responsible for:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Running vector search on each new transaction.
&lt;/li&gt;
&lt;li&gt;Retrieving similar transactions from the database.
&lt;/li&gt;
&lt;li&gt;Determining fraud likelihood based on similarity.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Now, this first method we will create is the method we are calling in our Change Stream listener. Create the &lt;code&gt;evaluateTransactionFraud()&lt;/code&gt; method that will take in our transaction document as a parameter.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;evaluateTransactionFraud&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt; &lt;span class="n"&gt;transactionDoc&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;transactionId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;transactionDoc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getString&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"transactionId"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;transactionDoc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getString&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"userId"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Double&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;transactionDoc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getList&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"embedding"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Double&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

        &lt;span class="c1"&gt;// Run vector search to find similar transactions  &lt;/span&gt;
        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;similarTransactions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;findSimilarTransactions&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

        &lt;span class="c1"&gt;// If no similar transactions exist for this user OR any of them are fraud -&amp;gt; Mark as fraud  &lt;/span&gt;
        &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="n"&gt;isFraud&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;similarTransactions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isEmpty&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt;  
                &lt;span class="n"&gt;similarTransactions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;anyMatch&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getBoolean&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"isFraud"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;  

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;isFraud&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="n"&gt;markTransactionAsFraud&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transactionId&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;First, we're going to extract our &lt;code&gt;transactionId&lt;/code&gt;, our &lt;code&gt;UserId&lt;/code&gt;, and our &lt;code&gt;Embedding&lt;/code&gt;. Next, we'll call a helper function  &lt;code&gt;findSimilarTransactions()&lt;/code&gt;, and pass in our embeddings and the user.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;    &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;findSimilarTransactions&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Double&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="nc"&gt;Bson&lt;/span&gt; &lt;span class="n"&gt;vectorSearch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Aggregates&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;vectorSearch&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
                &lt;span class="n"&gt;fieldPath&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"embedding"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;  
                &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
                &lt;span class="no"&gt;VECTOR_INDEX_NAME&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
                &lt;span class="no"&gt;SEARCH_LIMIT&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
                &lt;span class="nc"&gt;VectorSearchOptions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;approximateVectorSearchOptions&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;NUM_CANDIDATES&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
        &lt;span class="o"&gt;);&lt;/span&gt;  

        &lt;span class="nc"&gt;Bson&lt;/span&gt; &lt;span class="n"&gt;matchUser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Aggregates&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;match&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Filters&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;eq&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"userId"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;  

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;transactionCollection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;aggregate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Arrays&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;asList&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vectorSearch&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;matchUser&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;  
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;into&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ArrayList&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;());&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In &lt;code&gt;findSimilarTransactions()&lt;/code&gt;, we create an aggregation pipeline. Here, we set up our vector search to return the top five similar transactions, and filter by the user. We return this to our &lt;code&gt;evaluateTransactionFraud()&lt;/code&gt; method, where we run some fraud detection logic.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;        &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="n"&gt;isFraud&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;similarTransactions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isEmpty&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt;  
                &lt;span class="n"&gt;similarTransactions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;anyMatch&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getBoolean&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"isFraud"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will return &lt;code&gt;isFraud = True&lt;/code&gt; if we can't find any similar transactions for that user, or if any of the returned transactions are marked as fraud. If this is the case, we call the method, &lt;code&gt;markTransactionAsFraud()&lt;/code&gt;, and pass in the &lt;code&gt;transactionId&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;markTransactionAsFraud&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;transactionId&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="n"&gt;transactionCollection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;updateOne&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
                &lt;span class="nc"&gt;Filters&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;eq&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"transactionId"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;transactionId&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;  
                &lt;span class="nc"&gt;Updates&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;set&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"isFraud"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
        &lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;info&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Transaction marked as fraud: {}"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;transactionId&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will update the document in our MongoDB collection.&lt;/p&gt;

&lt;p&gt;Now, this logic works perfectly fine for my simple little demo, but will it work for you? Maybe fraudulent transactions look remarkably similar across all your customers. In this case, it would be advisable to not filter by user. Maybe a larger number of transactions need to be passed back to get a more thorough look at your users’ spending history. It might also make sense to maintain a list of likely fraudulent vendors to automatically flag. In the next section, we go through some of the limitations of this implementation and discuss how you may tailor it to your use case.&lt;/p&gt;

&lt;h3&gt;
  
  
  Limitations of our application
&lt;/h3&gt;

&lt;p&gt;While vector search is a powerful tool for detecting fraudulent transactions, there are a few notable limitations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MongoDB Atlas Vector Search does not support batching multiple queries. This means each transaction must be evaluated individually, which can lead to performance bottlenecks in high-throughput environments.
&lt;/li&gt;
&lt;li&gt;Since each transaction must be queried separately, this approach may become slow when processing hundreds upon thousands of transactions per second.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Atlas Triggers for monitoring incoming transactions
&lt;/h3&gt;

&lt;p&gt;To reduce latency and avoid excessive document transfers between the application and MongoDB, leverage &lt;a href="https://www.mongodb.com/docs/atlas/atlas-ui/triggers/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Building+a+Real-Time+AI+Fraud+Detection+System+with+Spring+Kafka+and+MongoDB&amp;amp;utm_term=tim.kelly#std-label-atlas-triggers" rel="noopener noreferrer"&gt;MongoDB Atlas Triggers&lt;/a&gt;. Instead of running vector search from the application, we can move that logic directly into MongoDB using an Atlas trigger.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Benefits

&lt;ul&gt;
&lt;li&gt;Reduces latency. Since the vector search runs within MongoDB, we avoid the overhead of sending documents back and forth between the database and the application.
&lt;/li&gt;
&lt;li&gt;Offloads processing to MongoDB. This means the application doesn't need to handle the change stream, freeing up resources for other tasks.
&lt;/li&gt;
&lt;li&gt;Ensures real-time fraud detection. As soon as a transaction is stored, it is immediately analyzed for fraud inside MongoDB.
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Limitations

&lt;ul&gt;
&lt;li&gt;Atlas Triggers have a hard timeout of 30 seconds, which may be problematic for high volumes of transactions or complex fraud detection logic.
&lt;/li&gt;
&lt;li&gt;Less flexible than application-side logic. If we need customized fraud detection workflows, it may be harder to modify compared to handling it in our application.
&lt;/li&gt;
&lt;li&gt;The trigger will still need to run a vector search for each transaction, which means batching remains a challenge.
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Implementation

&lt;ul&gt;
&lt;li&gt;MongoDB Atlas supports triggers that can execute logic when new documents are inserted into a collection.
&lt;/li&gt;
&lt;li&gt;Instead of listening for changes in the application, we can set up a trigger on the transactions collection.
&lt;/li&gt;
&lt;li&gt;When a new transaction is inserted, the trigger can run a vector search within MongoDB and flag the transaction as fraud if necessary.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;th&gt;Best Approach&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Low-latency fraud detection&lt;/td&gt;
&lt;td&gt;MongoDB Trigger&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Need for flexible fraud detection rules&lt;/td&gt;
&lt;td&gt;Application Logic&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;High transaction volume (thousands/sec)&lt;/td&gt;
&lt;td&gt;Triggers (30s limit)—consider batching in-memory&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Want to avoid excessive application queries&lt;/td&gt;
&lt;td&gt;MongoDB Trigger&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Other possible solutions
&lt;/h3&gt;

&lt;p&gt;Maintain a cache  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Store recent transactions in-memory and check against them before running vector search.
&lt;/li&gt;
&lt;li&gt;Benefits:

&lt;ul&gt;
&lt;li&gt;Reduces database query load by leveraging local memory.
&lt;/li&gt;
&lt;li&gt;Speeds up fraud detection by avoiding unnecessary queries.
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Implementation:

&lt;ul&gt;
&lt;li&gt;Store the last N transactions per user.
&lt;/li&gt;
&lt;li&gt;Check if the embedding is already in the cache before querying MongoDB.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;Run local similarity search  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use a local similarity search algorithm (e.g., cosine similarity) before querying MongoDB Vector Search.
&lt;/li&gt;
&lt;li&gt;Benefits:

&lt;ul&gt;
&lt;li&gt;Filters out transactions less likely to be fraud, reducing unnecessary vector searches.
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Implementation:

&lt;ul&gt;
&lt;li&gt;Store recent fraud transaction embeddings in-memory.
&lt;/li&gt;
&lt;li&gt;Run a cosine similarity check locally.
&lt;/li&gt;
&lt;li&gt;Only high-similarity transactions go to MongoDB Vector Search.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;Use a fraud likelihood score  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Assign a fraud likelihood score based on transaction metadata (e.g., unusual amount, category, merchant).
&lt;/li&gt;
&lt;li&gt;Benefits:

&lt;ul&gt;
&lt;li&gt;Prioritizes high-risk transactions for vector search.
&lt;/li&gt;
&lt;li&gt;Reduces unnecessary vector searches for low-risk transactions.
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Implementation:

&lt;ul&gt;
&lt;li&gt;Define rules for suspicious transactions (e.g., transaction amount is 10x higher than normal).
&lt;/li&gt;
&lt;li&gt;Transactions with a high fraud score are sent to MongoDB Vector Search.
&lt;/li&gt;
&lt;li&gt;Others skip vector search and are stored directly.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;When it comes to your application, it is likely going to be a hybrid approach. Each application will have its own considerations, needs, and limitations. In any case, a hybrid approach could work—using triggers to flag transactions and the application for deeper fraud analysis.&lt;/p&gt;

&lt;h2&gt;
  
  
  Run our application
&lt;/h2&gt;

&lt;p&gt;So with our fraud detection pipeline set up, it's time to run our application. First, we need to start our Kafka.&lt;/p&gt;

&lt;h3&gt;
  
  
  Starting Apache Kafka
&lt;/h3&gt;

&lt;p&gt;Before starting Kafka for the first time, we need to format its metadata log directory. This only needs to be done once.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;KAFKA_CLUSTER_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;bin/kafka-storage.sh random-uuid&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
bin/kafka-storage.sh format &lt;span class="nt"&gt;--standalone&lt;/span&gt; &lt;span class="nt"&gt;-t&lt;/span&gt; &lt;span class="nv"&gt;$KAFKA_CLUSTER_ID&lt;/span&gt; &lt;span class="nt"&gt;-c&lt;/span&gt; config/server.properties
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This initializes the internal metadata quorum and sets up Kafka’s KRaft storage directory. Now, we start the Kafka broker, using our config.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;bin/kafka-server-start.sh config/server.properties
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The broker handles all messaging: storing, forwarding, and distributing messages between producers and consumers. Since we’re running Kafka in standalone mode, this single broker is responsible for everything. If everything starts successfully, you should see Kafka logs indicating that the broker is up and running.&lt;/p&gt;

&lt;p&gt;Lastly, we need to open up a terminal and create our &lt;code&gt;transactions&lt;/code&gt; topic.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;bin/kafka-topics.sh &lt;span class="nt"&gt;--create&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--topic&lt;/span&gt; transactions &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--bootstrap-server&lt;/span&gt; localhost:9092 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--partitions&lt;/span&gt; 1 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--replication-factor&lt;/span&gt; 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command creates a topic named &lt;code&gt;transactions&lt;/code&gt;. It uses one partition for simplicity, but in production, you would likely want more for scalability. It also sets the replication factor to 1, since we're running only a single broker.&lt;/p&gt;

&lt;h3&gt;
  
  
  Run our Spring app
&lt;/h3&gt;

&lt;p&gt;Now that we have our Kafka up and running, time to run our app.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mvn clean spring-boot:run
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There will be some start-up time, about 100 seconds, to generate the sample data and populate the database. If everything is working, you should see logs indicating:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Customers have been seeded.
&lt;/li&gt;
&lt;li&gt;Transactions are being generated and sent to Kafka.
&lt;/li&gt;
&lt;li&gt;Kafka Consumer is saving transactions in MongoDB.
&lt;/li&gt;
&lt;li&gt;Change Stream is monitoring new transactions.
&lt;/li&gt;
&lt;li&gt;Vector Search is detecting fraud in real-time.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;hangeStreamListener  : Performing vector search
2025-02-24T15:29:56.479Z  INFO 81047 &lt;span class="nt"&gt;---&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;frauddetector] &lt;span class="o"&gt;[&lt;/span&gt;   scheduling-1] c.m.f.service.TransactionProducer        : Transaction sent to topic transactions
2025-02-24T15:29:56.520Z  INFO 81047 &lt;span class="nt"&gt;---&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;frauddetector] &lt;span class="o"&gt;[&lt;/span&gt;pool-2-thread-1] c.m.f.s.TransactionChangeStreamListener  : New transaction detected: 3ddc2181-5bb3-42b8-8b7b-953aaf098429
2025-02-24T15:29:56.520Z  INFO 81047 &lt;span class="nt"&gt;---&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;frauddetector] &lt;span class="o"&gt;[&lt;/span&gt;pool-2-thread-1] c.m.f.s.TransactionChangeStreamListener  : Performing vector search
2025-02-24T15:29:56.628Z  INFO 81047 &lt;span class="nt"&gt;---&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;frauddetector] &lt;span class="o"&gt;[&lt;/span&gt;pool-2-thread-1] c.m.f.s.TransactionVectorSearchService   : Transaction marked as fraud: 3ddc2181-5bb3-42b8-8b7b-953aaf098429
2025-02-24T15:29:56.839Z  INFO 81047 &lt;span class="nt"&gt;---&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;frauddetector] &lt;span class="o"&gt;[&lt;/span&gt;   scheduling-1] c.m.f.service.TransactionProducer        : Transaction sent to topic transactions
2025-02-24T15:29:56.881Z  INFO 81047 &lt;span class="nt"&gt;---&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;frauddetector] &lt;span class="o"&gt;[&lt;/span&gt;pool-2-thread-1] c.m.f.s.TransactionChangeStreamListener  : New transaction detected: 7354568d-7a39-4708-904b-b14c913ce2d0
2025-02-24T15:29:56.881Z  INFO 81047 &lt;span class="nt"&gt;---&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;frauddetector] &lt;span class="o"&gt;[&lt;/span&gt;pool-2-thread-1] c.m.f.s.TransactionChangeStreamListener  : Performing vector search
2025-02-24T15:29:57.050Z  INFO 81047 &lt;span class="nt"&gt;---&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;frauddetector] &lt;span class="o"&gt;[&lt;/span&gt;   scheduling-1] c.m.f.service.TransactionProducer        : Transaction sent to topic transactions
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;So, we now have a Spring app that can stream in financial transactions from Apache Kafka, and upon insertion into the database, can use Atlas Vector Search to indicate whether they appear to be fraudulent.&lt;/p&gt;

&lt;p&gt;If you found this tutorial useful, check out more about what you can do with MongoDB in Java, such as &lt;a href="https://www.mongodb.com/developer/languages/java/building-applications-temporal/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Building+a+Real-Time+AI+Fraud+Detection+System+with+Spring+Kafka+and+MongoDB&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;building invincible applications with Temporal and MongoDB&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>mongodb</category>
      <category>kafka</category>
      <category>spring</category>
      <category>ai</category>
    </item>
    <item>
      <title>How to Make a RAG Application With LangChain4j</title>
      <dc:creator>Tim Kelly</dc:creator>
      <pubDate>Thu, 13 Feb 2025 17:13:54 +0000</pubDate>
      <link>https://forem.com/mongodb/how-to-make-a-rag-application-with-langchain4j-1mad</link>
      <guid>https://forem.com/mongodb/how-to-make-a-rag-application-with-langchain4j-1mad</guid>
      <description>&lt;p&gt;Retrieval-augmented generation, or RAG, introduces some serious capabilities to your large language models (LLMs). These applications can answer questions about your specific corpus of knowledge, while leveraging all the nuance and sophistication of a traditional LLM.&lt;/p&gt;

&lt;p&gt;This tutorial will take you through the ins and outs of creating a Q&amp;amp;A chatbot using RAG. The application will:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Retrieve data from a MongoDB Atlas database.
&lt;/li&gt;
&lt;li&gt;Embed and store documents as vector embeddings.
&lt;/li&gt;
&lt;li&gt;Use LangChain4J to query the database and augment LLM prompts with the retrieved data.
&lt;/li&gt;
&lt;li&gt;Enable secure, scalable, and efficient AI-powered applications.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If you want to see the completed application, it is available in the &lt;a href="https://github.com/mongodb-developer/langchainrag" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why use RAG?
&lt;/h2&gt;

&lt;p&gt;RAG works by retrieving relevant data from your knowledge base and using that information to enrich the input to the LLM. Here are some of the major benefits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sensitive data management: RAG allows you to use sensitive or proprietary data without incorporating it into the LLM’s training set. This ensures data privacy and security while still enabling intelligent responses, particularly useful for instances such as GDPR compliance.
&lt;/li&gt;
&lt;li&gt;Real-time updates: Instead of retraining the model to include the latest information, an expensive process, RAG enables real-time updates by pulling current data from your knowledge base.
&lt;/li&gt;
&lt;li&gt;Increased relevance: By grounding responses in your own corpus of data, RAG ensures answers are both accurate and contextually relevant.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Use cases for RAG
&lt;/h2&gt;

&lt;p&gt;RAG is well-suited for a wide variety of applications, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Q&amp;amp;A applications: Answer user queries with precision, grounded in your company’s documentation, FAQs, or internal knowledge base.
&lt;/li&gt;
&lt;li&gt;Customer support chatbots: Personalize interactions by referencing customer history, CRM data, and past interactions.
&lt;/li&gt;
&lt;li&gt;Dynamic BI tools: Enable business intelligence applications to provide insights using live operational data from databases or spreadsheets.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  LangChain4J for RAG
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://docs.langchain4j.dev/" rel="noopener noreferrer"&gt;LangChain4J&lt;/a&gt; is a Java-based library designed to simplify the integration of LLMs into Java applications by abstracting away a lot of the necessary components in your AI applications. It offers an extensive toolbox for building applications powered by retrieval-augmented generation, enabling us to build quicker and create modular applications.&lt;/p&gt;

&lt;p&gt;LangChain4J provides the building blocks to streamline your RAG implementation while maintaining full control over the underlying architecture.&lt;/p&gt;

&lt;h2&gt;
  
  
  MongoDB for RAG
&lt;/h2&gt;

&lt;p&gt;MongoDB is an ideal database for RAG implementations due to its:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Native vector search: Store and query vector embeddings directly within MongoDB alongside your operational data, enabling retrieval of relevant context.
&lt;/li&gt;
&lt;li&gt;Flexible schema: Add new fields or adjust data models easily without the absolute hassle and mayhem of a data migration.
&lt;/li&gt;
&lt;li&gt;Scalability: Handle high throughput and large datasets with MongoDB’s horizontal scaling.
&lt;/li&gt;
&lt;li&gt;Operational efficiency: Use MongoDB’s aggregation pipelines, time-series collections, and multimodal capabilities to support both RAG and non-RAG workloads.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;For this tutorial, you will need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Java 21 or higher.
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://maven.apache.org/download.cgi" rel="noopener noreferrer"&gt;Maven&lt;/a&gt; or Gradle (for managing dependencies):

&lt;ul&gt;
&lt;li&gt;We use Maven for this tutorial.
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;A &lt;a href="https://www.mongodb.com/cloud/atlas/register?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=How%20to%20make%20a%20RAG%20application%20with%20LangChain4j&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;MongoDB Atlas account&lt;/a&gt; with a live cluster.
&lt;/li&gt;

&lt;li&gt;An &lt;a href="https://platform.openai.com/docs/overview" rel="noopener noreferrer"&gt;OpenAI API key&lt;/a&gt;.&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Our dependencies
&lt;/h2&gt;

&lt;p&gt;First things first, let's add our dependencies to our POM:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;dependencies&amp;gt;  
    &amp;lt;dependency&amp;gt;  
        &amp;lt;groupId&amp;gt;dev.langchain4j&amp;lt;/groupId&amp;gt;  
        &amp;lt;artifactId&amp;gt;langchain4j-open-ai&amp;lt;/artifactId&amp;gt;  
        &amp;lt;version&amp;gt;1.0.0-alpha1&amp;lt;/version&amp;gt;  
    &amp;lt;/dependency&amp;gt;  
    &amp;lt;dependency&amp;gt;  
        &amp;lt;groupId&amp;gt;dev.langchain4j&amp;lt;/groupId&amp;gt;  
        &amp;lt;artifactId&amp;gt;langchain4j-mongodb-atlas&amp;lt;/artifactId&amp;gt;  
        &amp;lt;version&amp;gt;1.0.0-alpha1&amp;lt;/version&amp;gt;  
    &amp;lt;/dependency&amp;gt;  
    &amp;lt;dependency&amp;gt;  
        &amp;lt;groupId&amp;gt;dev.langchain4j&amp;lt;/groupId&amp;gt;  
        &amp;lt;artifactId&amp;gt;langchain4j&amp;lt;/artifactId&amp;gt;  
        &amp;lt;version&amp;gt;1.0.0-alpha1&amp;lt;/version&amp;gt;  
    &amp;lt;/dependency&amp;gt;  
    &amp;lt;dependency&amp;gt;  
        &amp;lt;groupId&amp;gt;com.fasterxml.jackson.core&amp;lt;/groupId&amp;gt;  
        &amp;lt;artifactId&amp;gt;jackson-databind&amp;lt;/artifactId&amp;gt;  
        &amp;lt;version&amp;gt;2.18.1&amp;lt;/version&amp;gt;  
    &amp;lt;/dependency&amp;gt;  
&amp;lt;/dependencies&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;langchain4j-open-ai:

&lt;ul&gt;
&lt;li&gt;Embedding generation: Allows you to use OpenAI's embedding models, like &lt;code&gt;text-embedding-ada-002&lt;/code&gt;, for transforming textual data into vector representations.
&lt;/li&gt;
&lt;li&gt;Chat model integration: Supports communication with OpenAI’s GPT models (e.g., GPT-3.5, GPT-4), enabling conversational AI capabilities.
&lt;/li&gt;
&lt;li&gt;Simplified API calls: Abstracts the complexity of interacting with the OpenAI API, reducing boilerplate code and improving developer productivity.
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;langchain4j-mongodb-atlas:

&lt;ul&gt;
&lt;li&gt;Embedding store management: Simplifies storing and retrieving embeddings in MongoDB, making it an ideal solution for RAG applications.
&lt;/li&gt;
&lt;li&gt;Vector search support: Enables high-performance vector similarity queries using MongoDB Atlas's native capabilities.
&lt;/li&gt;
&lt;li&gt;Metadata handling: Allows storing and querying additional metadata associated with embeddings, which is vital for building rich, context-aware systems.
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;langchain4j: Provides the tools for building RAG workflows. It includes:

&lt;ul&gt;
&lt;li&gt;Classes for text segmentation, document splitting, and chunking, which help break down large documents into manageable pieces.
&lt;/li&gt;
&lt;li&gt;Utilities for connecting and orchestrating various components like embedding models, vector stores, and content retrievers.
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Jackson Databind simplifies the process of loading and handling JSON data.&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Setting up MongoDB and our embedding store
&lt;/h2&gt;

&lt;p&gt;To make our retrieval-augmented generation application work effectively, we need a robust and scalable solution for storing and querying embeddings. MongoDB, with its Atlas Search capabilities, serves as the backbone for this task. In this section, we’ll walk through how to set up MongoDB and configure an embedding store using LangChain4J’s MongoDB integration.&lt;/p&gt;

&lt;h3&gt;
  
  
  MongoDB setup
&lt;/h3&gt;

&lt;p&gt;The first step is to initialize a connection to our MongoDB cluster. We use the &lt;code&gt;MongoClient&lt;/code&gt; from the MongoDB Java driver to connect to our database:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.fasterxml.jackson.databind.JsonNode&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.fasterxml.jackson.databind.ObjectMapper&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.client.MongoClient&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.client.MongoClients&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb.client.model.CreateCollectionOptions&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.data.document.Document&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.data.document.DocumentSplitter&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.data.document.Metadata&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.data.embedding.Embedding&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.data.segment.TextSegment&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.model.chat.ChatLanguageModel&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.model.openai.OpenAiChatModel&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.model.openai.OpenAiEmbeddingModel&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.model.openai.OpenAiEmbeddingModelName&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.model.openai.OpenAiTokenizer&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.rag.content.retriever.ContentRetriever&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.rag.content.retriever.EmbeddingStoreContentRetriever&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.store.embedding.*&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.store.embedding.mongodb.IndexMapping&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.store.embedding.mongodb.MongoDbEmbeddingStore&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.service.AiServices&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.bson.conversions.Bson&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;dev.langchain4j.data.document.splitter.DocumentSplitters&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;LangChainRagApp&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="c1"&gt;// MongoDB setup  &lt;/span&gt;
            &lt;span class="nc"&gt;MongoClient&lt;/span&gt; &lt;span class="n"&gt;mongoClient&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MongoClients&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;create&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"CONNECTION_URI"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Exception&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;printStackTrace&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
        &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Replace &lt;code&gt;"CONNECTION_URI"&lt;/code&gt; with your actual MongoDB connection string, which includes your database credentials and cluster information. This connection will be used to interact with the database and perform operations like storing and retrieving embeddings.&lt;/p&gt;

&lt;p&gt;We are also adding in our whole host of imports we will use throughout this tutorial. Don't worry, we will go through all these as we add them to our application.&lt;/p&gt;

&lt;h3&gt;
  
  
  Configuring the embedding store
&lt;/h3&gt;

&lt;p&gt;The embedding store is the corpus of knowledge of our RAG application, where all embeddings and their associated metadata are stored. Let's add a method and call it from our main:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="nc"&gt;EmbeddingStore&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;TextSegment&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;createEmbeddingStore&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MongoClient&lt;/span&gt; &lt;span class="n"&gt;mongoClient&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;databaseName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"rag_app"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;collectionName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"embeddings"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;indexName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"embedding"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="nc"&gt;Long&lt;/span&gt; &lt;span class="n"&gt;maxResultRatio&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10L&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="nc"&gt;CreateCollectionOptions&lt;/span&gt; &lt;span class="n"&gt;createCollectionOptions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;CreateCollectionOptions&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="nc"&gt;Bson&lt;/span&gt; &lt;span class="n"&gt;filter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="nc"&gt;Set&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;metadataFields&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;HashSet&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;();&lt;/span&gt;
    &lt;span class="nc"&gt;IndexMapping&lt;/span&gt; &lt;span class="n"&gt;indexMapping&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;IndexMapping&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1536&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metadataFields&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="nc"&gt;Boolean&lt;/span&gt; &lt;span class="n"&gt;createIndex&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;MongoDbEmbeddingStore&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;mongoClient&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;databaseName&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;collectionName&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;indexName&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;maxResultRatio&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;createCollectionOptions&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;indexMapping&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;createIndex&lt;/span&gt;
    &lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's explore the parameters we set up:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Database name (&lt;code&gt;databaseName&lt;/code&gt;)
We specify &lt;code&gt;"rag_app"&lt;/code&gt; as the database where our embeddings will be stored. You can rename this to suit your application.
&lt;/li&gt;
&lt;li&gt;Collection name (&lt;code&gt;collectionName&lt;/code&gt;)
The collection &lt;code&gt;"embeddings"&lt;/code&gt; will hold the embedding data and metadata. Collections in MongoDB are analogous to tables in relational databases.
&lt;/li&gt;
&lt;li&gt;Index name (&lt;code&gt;indexName&lt;/code&gt;)
The &lt;code&gt;"embedding"&lt;/code&gt; index enables efficient vector search operations. This index is crucial for retrieving relevant embeddings quickly based on similarity scores.
&lt;/li&gt;
&lt;li&gt;Max result ratio (&lt;code&gt;maxResultRatio&lt;/code&gt;)
This defines the maximum number of results to return during retrieval, keeping the results manageable.
&lt;/li&gt;
&lt;li&gt;Create collection options (&lt;code&gt;createCollectionOptions&lt;/code&gt;)
Options for creating the collection can be customized here. For example, you could configure specific validation rules or shard keys.
&lt;/li&gt;
&lt;li&gt;Filter (&lt;code&gt;filter&lt;/code&gt;)
Currently set to &lt;code&gt;null&lt;/code&gt;, this can be used to define custom filtering criteria if needed for specific retrieval operations.
&lt;/li&gt;
&lt;li&gt;Metadata fields (&lt;code&gt;metadataFields&lt;/code&gt;)
A set of metadata field names that can be indexed alongside embeddings for richer search capabilities, this allows for queries based on both vector similarity and metadata.
&lt;/li&gt;
&lt;li&gt;Index mapping (&lt;code&gt;indexMapping&lt;/code&gt;)
This maps the dimensionality of the embedding vectors (e.g., &lt;code&gt;1536&lt;/code&gt; for OpenAI’s &lt;code&gt;text-embedding-ada-002&lt;/code&gt;). This ensures compatibility with the vector model being used.
&lt;/li&gt;
&lt;li&gt;Create index (&lt;code&gt;createIndex&lt;/code&gt;)
When set to &lt;code&gt;true&lt;/code&gt;, this flag ensures that the necessary index for vector searches is created automatically.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In the main method, we call this method and assign the result to an &lt;code&gt;EmbeddingStore&lt;/code&gt; instance:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;LangChainRagApp&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="c1"&gt;// MongoDB setup  &lt;/span&gt;
            &lt;span class="nc"&gt;MongoClient&lt;/span&gt; &lt;span class="n"&gt;mongoClient&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MongoClients&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;create&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"CONNECTION_URI"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

            &lt;span class="c1"&gt;// Embedding Store  &lt;/span&gt;
&lt;span class="nc"&gt;EmbeddingStore&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;TextSegment&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;embeddingStore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;createEmbeddingStore&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mongoClient&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Exception&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;printStackTrace&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
        &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This &lt;code&gt;embeddingStore&lt;/code&gt; is now ready to store, retrieve, and manage our embeddings, with all the beauty and benefits of MongoDB behind it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Creating our embedding model
&lt;/h2&gt;

&lt;p&gt;The embedding model is the engine that converts raw text into numerical representations, also known as embeddings. These embeddings are high-dimensional representations of our data that capture the semantic meaning of text, making them the foundation for similarity searches in a retrieval-augmented generation application.&lt;/p&gt;

&lt;p&gt;In this section, we set up an embedding model using OpenAI's &lt;code&gt;text-embedding-ada-002&lt;/code&gt;. To configure the embedding model, we use LangChain4J's &lt;code&gt;OpenAiEmbeddingModel&lt;/code&gt; builder, which abstracts the complexities of interacting with OpenAI's API. Here’s the implementation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;LangChainRagApp&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="c1"&gt;// ...&lt;/span&gt;

            &lt;span class="c1"&gt;// Embedding Model setup  &lt;/span&gt;
&lt;span class="nc"&gt;OpenAiEmbeddingModel&lt;/span&gt; &lt;span class="n"&gt;embeddingModel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAiEmbeddingModel&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;  
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"OPEN_AI_API_KEY"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;modelName&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;OpenAiEmbeddingModelName&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;TEXT_EMBEDDING_ADA_002&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;  
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Exception&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;printStackTrace&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
        &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;API key (&lt;code&gt;apiKey&lt;/code&gt;)
The API key provides access to OpenAI’s services. Replace &lt;code&gt;"OPEN_AI_API_KEY"&lt;/code&gt; with your actual OpenAI API key.
&lt;/li&gt;
&lt;li&gt;Model name (&lt;code&gt;modelName&lt;/code&gt;)
We specify &lt;code&gt;OpenAiEmbeddingModelName.TEXT_EMBEDDING_ADA_002&lt;/code&gt;. This model offers:

&lt;ul&gt;
&lt;li&gt;High dimensionality (1536 dimensions): Captures detailed semantic information.
&lt;/li&gt;
&lt;li&gt;General-purpose embeddings: Suitable for a multitude of embedding tasks like document retrieval, clustering, and classification.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This embedding model is critical for generating vector representations of the text data we work with.&lt;/p&gt;

&lt;h2&gt;
  
  
  Configuring our chat model
&lt;/h2&gt;

&lt;p&gt;In a retrieval-augmented generation application, the chat model serves as the conversational engine. It generates our context-aware, human-like responses based on the user’s query and retrieved content. For this tutorial, we configure a chat model using OpenAI's GPT-4 (other AI models are available), simplified by LangChain4J’s straightforward API.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;LangChainRagApp&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="c1"&gt;// ...&lt;/span&gt;

            &lt;span class="c1"&gt;// Chat Model setup&lt;/span&gt;
            &lt;span class="nc"&gt;ChatLanguageModel&lt;/span&gt; &lt;span class="n"&gt;chatModel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAiChatModel&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"OPEN_AI_API_KEY"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;modelName&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"gpt-4"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Exception&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;printStackTrace&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
        &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Replace the API key, just as before. We also specify the model name here.&lt;/p&gt;

&lt;p&gt;The chat model becomes the core of answering user queries in the RAG flow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Retrieve relevant content: The embedding store retrieves relevant documents based on the user’s query.
&lt;/li&gt;
&lt;li&gt;Generate a response: The chat model uses the retrieved content as context to generate a detailed and accurate answer.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For instance, a query like:&lt;/p&gt;

&lt;p&gt;"How does Atlas Vector Search work?"&lt;/p&gt;

&lt;p&gt;Would involve retrieving our related embeddings about Atlas Vector Search from the MongoDB vector store, and then GPT-4 would generate a response using that context.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to load our data
&lt;/h2&gt;

&lt;p&gt;We are going to be loading our data, which we can download from &lt;a href="https://huggingface.co/datasets/MongoDB/devcenter-articles" rel="noopener noreferrer"&gt;MongoDB's Hugging Face&lt;/a&gt;. It is a collection of approximately 600 articles and tutorials from &lt;a href="https://www.mongodb.com/developer/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=How%20to%20make%20a%20RAG%20application%20with%20LangChain4j&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;MongoDB's Developer Center&lt;/a&gt;. We are going to place this &lt;a href="https://huggingface.co/datasets/MongoDB/devcenter-articles/tree/main" rel="noopener noreferrer"&gt;&lt;code&gt;devcenter-content-snapshot.2024-05-20 1.json&lt;/code&gt;&lt;/a&gt; file into the resources folder.&lt;/p&gt;

&lt;p&gt;Now, we need a method &lt;code&gt;loadJsonDocuments&lt;/code&gt; to handle our logic. The &lt;code&gt;loadJsonDocuments&lt;/code&gt; method handles loading and processing the dataset. It reads the JSON file, extracts relevant content (title, body, metadata), and splits it into smaller segments for embedding.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;TextSegment&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;loadJsonDocuments&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;resourcePath&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;maxTokensPerChunk&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;overlapTokens&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;IOException&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;TextSegment&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;textSegments&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ArrayList&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;();&lt;/span&gt;

    &lt;span class="c1"&gt;// Load file from resources using the ClassLoader&lt;/span&gt;
    &lt;span class="nc"&gt;InputStream&lt;/span&gt; &lt;span class="n"&gt;inputStream&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LangChainRagApp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getClassLoader&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;getResourceAsStream&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resourcePath&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;inputStream&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;FileNotFoundException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Resource not found: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;resourcePath&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// Jackson ObjectMapper&lt;/span&gt;
    &lt;span class="nc"&gt;ObjectMapper&lt;/span&gt; &lt;span class="n"&gt;objectMapper&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ObjectMapper&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="nc"&gt;BufferedReader&lt;/span&gt; &lt;span class="n"&gt;reader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BufferedReader&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;InputStreamReader&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;inputStream&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;

    &lt;span class="c1"&gt;// Batch size for processing&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;batchSize&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// Adjust batch size as needed&lt;/span&gt;
    &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;batch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ArrayList&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;();&lt;/span&gt;

    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="o"&gt;((&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;readLine&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;JsonNode&lt;/span&gt; &lt;span class="n"&gt;jsonNode&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;objectMapper&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;readTree&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;jsonNode&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"title"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;asText&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;jsonNode&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"body"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;asText&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="nc"&gt;JsonNode&lt;/span&gt; &lt;span class="n"&gt;metadataNode&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;jsonNode&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"metadata"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;"\n\n"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;body&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

            &lt;span class="nc"&gt;Metadata&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Metadata&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metadataNode&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;metadataNode&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isObject&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="nc"&gt;Iterator&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;fieldNames&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;metadataNode&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fieldNames&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
                &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fieldNames&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;hasNext&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;fieldName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fieldNames&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;next&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
                    &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fieldName&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metadataNode&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fieldName&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;asText&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
                &lt;span class="o"&gt;}&lt;/span&gt;
            &lt;span class="o"&gt;}&lt;/span&gt;

            &lt;span class="nc"&gt;Document&lt;/span&gt; &lt;span class="n"&gt;document&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;from&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="n"&gt;batch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;add&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

            &lt;span class="c1"&gt;// If batch size is reached, process the batch&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;batch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;size&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;batchSize&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;textSegments&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;addAll&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;splitIntoChunks&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;batch&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;maxTokensPerChunk&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;overlapTokens&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
                &lt;span class="n"&gt;batch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;clear&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
            &lt;span class="o"&gt;}&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// Process remaining documents in the last batch&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(!&lt;/span&gt;&lt;span class="n"&gt;batch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isEmpty&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;textSegments&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;addAll&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;splitIntoChunks&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;batch&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;maxTokensPerChunk&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;overlapTokens&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;textSegments&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Documents need to be divided into smaller chunks to fit within the token limits of our embedding model. We achieve this using the &lt;code&gt;splitIntoChunks&lt;/code&gt; method. Here, we will use &lt;code&gt;DocumentSplitter&lt;/code&gt;, a tool provided to us by LangChain4j to divide our documents into manageable chunks, while maintaining the original context they provide.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;
&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;TextSegment&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;splitIntoChunks&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;maxTokensPerChunk&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;overlapTokens&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
    &lt;span class="c1"&gt;// Create a tokenizer for OpenAI  &lt;/span&gt;
    &lt;span class="nc"&gt;OpenAiTokenizer&lt;/span&gt; &lt;span class="n"&gt;tokenizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenAiTokenizer&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;OpenAiEmbeddingModelName&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;TEXT_EMBEDDING_ADA_002&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  

    &lt;span class="c1"&gt;// Create a recursive document splitter with the specified token size and overlap  &lt;/span&gt;
    &lt;span class="nc"&gt;DocumentSplitter&lt;/span&gt; &lt;span class="n"&gt;splitter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DocumentSplitters&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;recursive&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;  
            &lt;span class="n"&gt;maxTokensPerChunk&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
            &lt;span class="n"&gt;overlapTokens&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;  
            &lt;span class="n"&gt;tokenizer&lt;/span&gt;  
    &lt;span class="o"&gt;);&lt;/span&gt;  

    &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;TextSegment&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;allSegments&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ArrayList&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;();&lt;/span&gt;  
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt; &lt;span class="n"&gt;document&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;TextSegment&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;segments&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;splitter&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;split&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
        &lt;span class="n"&gt;allSegments&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;addAll&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;segments&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;allSegments&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Parameters
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;maxTokensPerChunk&lt;/code&gt;: Maximum tokens allowed in each segment. This ensures compatibility with the model’s token limit.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;overlapTokens&lt;/code&gt;: Number of overlapping tokens between consecutive chunks. Overlaps help preserve context across segments.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now, time to add this to our main method. The main method orchestrates the entire process: loading the data, embedding it, and storing it in the embedding store.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;LangChainRagApp&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="c1"&gt;// ...&lt;/span&gt;

            &lt;span class="c1"&gt;// Load documents&lt;/span&gt;
            &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;resourcePath&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"devcenter-content-snapshot.2024-05-20.json"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
            &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;TextSegment&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;loadJsonDocuments&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resourcePath&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;800&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

            &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Loaded "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;size&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;" documents"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;size&lt;/span&gt;&lt;span class="o"&gt;()/&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="nc"&gt;TextSegment&lt;/span&gt; &lt;span class="n"&gt;segment&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
                &lt;span class="nc"&gt;Embedding&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embeddingModel&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;embed&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;segment&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="o"&gt;()).&lt;/span&gt;&lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
                &lt;span class="n"&gt;embeddingStore&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;add&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;segment&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="o"&gt;}&lt;/span&gt;

            &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Stored embeddings"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Exception&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;printStackTrace&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
        &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I added a few comments here to help us track our progress as we ingest our data. I also adjusted to only intake the first 10% of the documents. When I did this with the entire dataset, it took 30+ minutes to load in all the data on my slow internet. Feel free to adjust this, as the more data ingested, the more potentially accurate the answers.&lt;/p&gt;

&lt;h2&gt;
  
  
  Creating our content retriever
&lt;/h2&gt;

&lt;p&gt;In our retrieval-augmented generation application, the Content Retriever fetches the most relevant content from a data source based on our user query. LangChain4J provides an abstraction for this, allowing us to connect our embedding store and embedding model to retrieve content.&lt;/p&gt;

&lt;p&gt;We use the &lt;code&gt;EmbeddingStoreContentRetriever&lt;/code&gt; to retrieve content from the embedding store by embedding the user query and finding the most relevant matches.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;LangChainRagApp&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="c1"&gt;// ...&lt;/span&gt;

            &lt;span class="c1"&gt;// Content Retriever&lt;/span&gt;
            &lt;span class="nc"&gt;ContentRetriever&lt;/span&gt; &lt;span class="n"&gt;contentRetriever&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;EmbeddingStoreContentRetriever&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;embeddingStore&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embeddingStore&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;embeddingModel&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embeddingModel&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;maxResults&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;minScore&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.75&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Exception&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;printStackTrace&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
        &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let’s break down what makes this Content Retriever:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;embeddingStore&lt;/code&gt;: This is our corpus of data we set up earlier. It’s where all the vectorized representations of our documents live.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;embeddingModel&lt;/code&gt;: This is the brains behind the operation. It’s the same model we used to create the embeddings (e.g., &lt;code&gt;text-embedding-ada-002&lt;/code&gt;). By using the same model here, we ensure that the user’s query is embedded in the same "language" as the stored content.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;maxResults&lt;/code&gt;: Setting &lt;code&gt;maxResults&lt;/code&gt; to &lt;code&gt;5&lt;/code&gt; means the retriever will hand us up to five of the most relevant matches for your query.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;minScore&lt;/code&gt;: This is your quality filter. By setting a &lt;code&gt;minScore&lt;/code&gt; of &lt;code&gt;0.75&lt;/code&gt;, we’re saying, "Don’t bother showing me anything that’s not highly relevant." If none of the results meet this threshold, we’ll get an empty list instead of cluttered, irrelevant data.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By tweaking these parameters, we can fine-tune how your retriever performs, ensuring it delivers exactly what we need!&lt;/p&gt;

&lt;h2&gt;
  
  
  Asking questions
&lt;/h2&gt;

&lt;p&gt;Time to put the pieces together. We need a way to bring all our components together to query our enhanced LLM. First, we need to create an interface for our assistant. Create an interface, as shown below.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;interface&lt;/span&gt; &lt;span class="nc"&gt;Assistant&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;answer&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Keeping it very simple, we just want to provide a question and get an answer. Next, we need to create and call our assistant in our main class.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.mongodb&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;LangChainRagApp&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="c1"&gt;// ...&lt;/span&gt;

            &lt;span class="c1"&gt;// Assistant&lt;/span&gt;
            &lt;span class="nc"&gt;Assistant&lt;/span&gt; &lt;span class="n"&gt;assistant&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AiServices&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Assistant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;chatLanguageModel&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chatModel&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;contentRetriever&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;contentRetriever&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

            &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;assistant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;answer&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"How to use Atlas Triggers and AI to summarise AirBnB reviews?"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

            &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Exception&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;  
            &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;printStackTrace&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;  
        &lt;span class="o"&gt;}&lt;/span&gt;  
    &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, in this implementation, I kept it very simple and kept the query in line. There is nothing stopping you from implementing the querying system in an API, or in terminal, or any other way you can imagine. Let's take a look at our reply:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;To summarise Airbnb reviews using MongoDB Atlas Triggers and OpenAI, follow these steps:

1. **Prerequisites**: Set up an App Services application to link to the cluster with Airbnb data. Also, create an OpenAI account with API access.

2. **Set up Secrets and Values**: In App Services, create a secret named `openAIKey` with your OpenAI API key. Then, create a linked value named `OpenAIKey` and link it to the secret.

3. **Trigger code**: The trigger listens for changes in the sample_airbnb.listingsAndReviews collection. When a new review is detected, it samples up to 50 reviews, sends them to OpenAI's API for summarisation, and updates the original document with the summarised content and tags. The trigger reacts to updates that are marked with the `"process" : false` flag, which indicates that a summary hasn't been created for the batch of reviews yet.

4. **Sample Reviews Function**: To avoid overloading the API with too many reviews, a function called `sampleReviews` is defined that randomly samples up to 50 reviews.

5. **API Interaction**: Using the `context.http.post` method, the API request is sent to the OpenAI API.

6. **Updating the Original Document**: Once a successful response from the API is received, the trigger updates the original document with the summarised content, negative tags (neg_tags), positive tags (pos_tags), and a process flag set to true.

7. **Displaying the Data**: Once the data is added to the documents, it can be displayed in a VUE application by adding an HTML template.

By combining MongoDB Atlas triggers with OpenAI's powerful models, large volumes of reviews can be processed and analysed in real-time. This not only provides concise summaries of reviews but also categorises them into positive and negative tags, offering valuable insights to property hosts and potential renters.

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is a well informed response, and actually references the information available in the original tutorial, &lt;a href="https://www.mongodb.com/developer/products/mongodb/atlas-open-ai-review-summary/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=How%20to%20make%20a%20RAG%20application%20with%20LangChain4j&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;Using MongoDB Atlas Triggers to Summarize Airbnb Reviews With OpenAI&lt;/a&gt;. Want the code? Just ask in the query. It will tailor the responses to exactly what you ask!&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;There we have it—we used MongoDB with LangChain4j to create a simple RAG application. LangChain4j abstracted away a lot of the steps along the way, from segmenting our data, to connecting to our MongoDB database and embedding model.&lt;/p&gt;

&lt;p&gt;If you found this tutorial useful, head over to the Developer Center and check out some of our other tutorials, such as &lt;a href="https://www.mongodb.com/developer/languages/java/terraform-springai-rag/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=How%20to%20make%20a%20RAG%20application%20with%20LangChain4j&amp;amp;utm_term=tim.kelly" rel="noopener noreferrer"&gt;Terraforming AI Workflows: RAG With MongoDB Atlas and Spring AI&lt;/a&gt;, or head over to &lt;a href="https://docs.langchain4j.dev/" rel="noopener noreferrer"&gt;LangChain4j&lt;/a&gt; to learn more about what you can do with MongoDB and AI in Java.  &lt;/p&gt;

</description>
      <category>mongodb</category>
      <category>rag</category>
      <category>java</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
