<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Luis Cossío</title>
    <description>The latest articles on Forem by Luis Cossío (@coszio).</description>
    <link>https://forem.com/coszio</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1271141%2F394b2433-d425-45b2-8c5d-a2f25a23be6a.jpeg</url>
      <title>Forem: Luis Cossío</title>
      <link>https://forem.com/coszio</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/coszio"/>
    <language>en</language>
    <item>
      <title>Introducing Gridstore: Qdrant's Custom Key-Value Store</title>
      <dc:creator>Luis Cossío</dc:creator>
      <pubDate>Thu, 06 Feb 2025 12:07:39 +0000</pubDate>
      <link>https://forem.com/qdrant/introducing-gridstore-qdrants-custom-key-value-store-2li9</link>
      <guid>https://forem.com/qdrant/introducing-gridstore-qdrants-custom-key-value-store-2li9</guid>
      <description>&lt;h2&gt;
  
  
  Why We Built Our Own Storage Engine
&lt;/h2&gt;

&lt;p&gt;Databases need a place to store and retrieve data. That’s what Qdrant's &lt;a href="https://en.wikipedia.org/wiki/Key%E2%80%93value_database" rel="noopener noreferrer"&gt;&lt;strong&gt;key-value storage&lt;/strong&gt;&lt;/a&gt; does—it links keys to values.&lt;/p&gt;

&lt;p&gt;When we started building Qdrant, we needed to pick something ready for the task. So we chose &lt;a href="https://rocksdb.org" rel="noopener noreferrer"&gt;&lt;strong&gt;RocksDB&lt;/strong&gt;&lt;/a&gt; as our embedded key-value store.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdwriweh16glaxmgrwt8m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdwriweh16glaxmgrwt8m.png" alt="Image description" width="730" height="213"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Over time, we ran into issues. RocksDB’s &lt;a href="https://en.wikipedia.org/wiki/Log-structured_merge-tree" rel="noopener noreferrer"&gt;LSM-tree&lt;/a&gt; architecture requires compaction, which caused random latency spikes. It handles generic keys, while we only use sequential IDs. Its many configuration options make it versatile, but tuning them accurately was a headache. Finally, interoperating with C++ slowed us down (although we will still support it for quite some time 😭).&lt;br&gt;
While there are already some good options written in Rust that we could leverage, nothing out there fit our needs the way we wanted, so we needed something custom. We didn’t require generic keys. We wanted full control over when and which data is written and flushed. Our system already has crash-recovery mechanisms built in, and online compaction isn’t a priority because our optimizers already handle that. Debugging misconfigurations was not a great use of our time.&lt;br&gt;
So we built our own storage. As of &lt;a href="https://qdrant.tech/blog/qdrant-1.13.x/"&gt;&lt;strong&gt;Qdrant Version 1.13&lt;/strong&gt;&lt;/a&gt;, we use Gridstore for &lt;strong&gt;payload and sparse vector storage&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqg5t7nghedw9zqxlv03l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqg5t7nghedw9zqxlv03l.png" alt="Image description" width="800" height="200"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h4&gt;
  
  
  In this article, you’ll learn about:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;How Gridstore works&lt;/strong&gt; – a deep dive into its architecture and mechanics.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Why we built it this way&lt;/strong&gt; – the key design decisions that shaped it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rigorous testing&lt;/strong&gt; – how we ensured the new storage is production-ready.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance benchmarks&lt;/strong&gt; – official metrics that demonstrate its efficiency.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Our first challenge?&lt;/strong&gt; Figuring out the best way to handle sequential keys and variable-sized data.&lt;/p&gt;
&lt;h2&gt;
  
  
  Gridstore Architecture: Three Main Components
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fng6hr3zi52diiqwdx4sj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fng6hr3zi52diiqwdx4sj.png" alt="Image description" width="800" height="113"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Data Layer&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Stores values in fixed-sized blocks and retrieves them using a pointer-based lookup system for efficient access.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Mask Layer&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Maintains a bitmask to track block usage, distinguishing between allocated and available blocks.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Gaps Layer&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Manages block availability at a higher level, optimizing space allocation and reuse.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;h3&gt;
  
  
  1. The Data Layer for Fast Retrieval
&lt;/h3&gt;

&lt;p&gt;At the core of Gridstore is &lt;strong&gt;The Data Layer&lt;/strong&gt;, which stores and retrieves values quickly by key while supporting variable-sized data. Its two main components are &lt;strong&gt;The Tracker&lt;/strong&gt; and &lt;strong&gt;The Data Grid&lt;/strong&gt;.&lt;br&gt;
Since internal IDs are always sequential integers (0, 1, 2, 3, 4, ...), the tracker is an array of pointers, where each pointer tells the system exactly where a value starts and how long it is. &lt;/p&gt;
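&lt;p&gt;As a rough sketch of the idea (the type and field names here are hypothetical, not Qdrant’s actual code), the tracker can be modeled as a plain vector indexed by key:&lt;/p&gt;

```rust
/// Hypothetical tracker entry: where a value lives and how long it is.
#[derive(Clone, Copy, Debug, PartialEq)]
struct ValuePointer {
    page_id: u32,      // which page file holds the value
    block_offset: u32, // first block of the value within that page
    length: u32,       // value length in bytes
}

/// Because keys are sequential integers, the tracker is just a vector:
/// key N lives at index N, so a lookup is a single array access.
struct Tracker {
    pointers: Vec<Option<ValuePointer>>, // None = deleted or never written
}

impl Tracker {
    fn get(&self, key: u32) -> Option<ValuePointer> {
        self.pointers.get(key as usize).copied().flatten()
    }

    fn set(&mut self, key: u32, ptr: ValuePointer) {
        let idx = key as usize;
        if idx >= self.pointers.len() {
            self.pointers.resize(idx + 1, None);
        }
        self.pointers[idx] = Some(ptr);
    }
}
```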

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fko72b7r8luo39rj2apfo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fko72b7r8luo39rj2apfo.png" alt="Image description" width="800" height="271"&gt;&lt;/a&gt;&lt;br&gt;
The Data Layer uses an array of pointers to quickly retrieve data.&lt;/p&gt;

&lt;p&gt;This makes lookups incredibly fast. For example, finding key 3 is just a matter of jumping to index 3 of the tracker and following the pointer to find the value in the data grid. &lt;/p&gt;

&lt;p&gt;However, because values are of variable size, the data itself is stored separately in a grid of fixed-sized blocks, which are grouped into larger page files. The fixed size of each block is usually 128 bytes. When inserting a value, Gridstore allocates one or more consecutive blocks to store it, ensuring that each block only holds data from a single value.&lt;/p&gt;
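&lt;p&gt;Since each block only holds data from a single value, the number of blocks a value occupies is just a ceiling division. A minimal sketch:&lt;/p&gt;

```rust
/// Block size mentioned in the article; Gridstore's actual value is configurable.
const BLOCK_SIZE: usize = 128;

/// Number of consecutive blocks needed to store a value of `len` bytes.
/// Each block belongs to a single value, so partial blocks round up.
fn blocks_needed(len: usize) -> usize {
    len.div_ceil(BLOCK_SIZE)
}
```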
&lt;h3&gt;
  
  
  2. The Mask Layer Reuses Space
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;The Mask Layer&lt;/strong&gt; helps Gridstore handle updates and deletions without the need for expensive data compaction. Instead of maintaining complex metadata for each block, Gridstore tracks usage with a bitmask, where each bit represents a block, with 1 for used, 0 for free.  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi2hhaftlk0xddax0lcwu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi2hhaftlk0xddax0lcwu.png" alt="Image description" width="800" height="278"&gt;&lt;/a&gt;&lt;br&gt;
The bitmask efficiently tracks block usage.&lt;/p&gt;

&lt;p&gt;This makes it easy to determine where new values can be written. When a value is removed, its pointer is soft-deleted and the corresponding blocks are marked as available in the bitmask. Similarly, when a value is updated, the new version is written elsewhere and the old blocks are freed in the bitmask.&lt;br&gt;
This approach ensures that Gridstore doesn’t waste space. As the storage grows, however, scanning the entire bitmask for available blocks can become computationally expensive.&lt;/p&gt;
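&lt;p&gt;A toy version of such a bitmask (not Gridstore’s implementation) shows both the cheap mark/unmark operations and the linear scan that becomes expensive at scale:&lt;/p&gt;

```rust
/// Minimal block-usage bitmask: one bit per block, 1 = used, 0 = free.
struct BlockMask {
    words: Vec<u64>,
    num_blocks: usize,
}

impl BlockMask {
    fn new(num_blocks: usize) -> Self {
        Self { words: vec![0; num_blocks.div_ceil(64)], num_blocks }
    }
    fn is_used(&self, block: usize) -> bool {
        self.words[block / 64] >> (block % 64) & 1 == 1
    }
    fn set_used(&mut self, block: usize) {
        self.words[block / 64] |= 1 << (block % 64);
    }
    fn set_free(&mut self, block: usize) {
        self.words[block / 64] &= !(1 << (block % 64));
    }
    /// Find `n` consecutive free blocks by scanning the whole mask.
    /// This linear scan is exactly the cost the gaps layer avoids.
    fn find_free_run(&self, n: usize) -> Option<usize> {
        let mut run = 0;
        for block in 0..self.num_blocks {
            if self.is_used(block) {
                run = 0;
            } else {
                run += 1;
                if run == n {
                    return Some(block + 1 - n);
                }
            }
        }
        None
    }
}
```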
&lt;h3&gt;
  
  
  3. The Gaps Layer for Effective Updates
&lt;/h3&gt;

&lt;p&gt;To further optimize update handling, Gridstore introduces &lt;strong&gt;The Gaps Layer&lt;/strong&gt;, which provides a higher-level view of block availability. &lt;br&gt;
Instead of scanning the entire bitmask, Gridstore splits the bitmask into regions and keeps track of the largest contiguous free space within each region, known as &lt;strong&gt;The Region Gap&lt;/strong&gt;. By also storing the leading and trailing gaps of each region, the system can efficiently combine multiple regions when needed for storing large values.&lt;/p&gt;
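&lt;p&gt;The per-region bookkeeping can be sketched as follows (a simplified model with hypothetical names):&lt;/p&gt;

```rust
/// Hypothetical per-region summary of free-block runs.
#[derive(Debug, PartialEq)]
struct RegionGaps {
    leading: usize,  // free run touching the region's left edge
    trailing: usize, // free run touching the region's right edge
    max: usize,      // largest free run anywhere in the region
}

/// Compute the gaps for one region, given its blocks as bools (true = used).
fn region_gaps(blocks: &[bool]) -> RegionGaps {
    let leading = blocks.iter().take_while(|&&used| !used).count();
    let trailing = blocks.iter().rev().take_while(|&&used| !used).count();
    let mut max = 0;
    let mut run = 0;
    for &used in blocks {
        if used {
            run = 0;
        } else {
            run += 1;
            max = max.max(run);
        }
    }
    RegionGaps { leading, trailing, max }
}
```

&lt;p&gt;When a value is larger than any single region’s gap, the trailing gap of one region can be combined with the leading gap of the next to form a longer free run across the boundary.&lt;/p&gt;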

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3wski880wmo2ed91syno.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3wski880wmo2ed91syno.png" alt="Image description" width="800" height="563"&gt;&lt;/a&gt;&lt;br&gt;
The complete architecture of Gridstore&lt;/p&gt;

&lt;p&gt;This layered approach allows Gridstore to locate available space quickly, scaling down the work required for scans while keeping memory overhead minimal. With this system, finding storage space for new values requires scanning only a tiny fraction of the total metadata, making updates and insertions highly efficient, even in large segments.&lt;br&gt;
With the default configuration, the gaps layer is roughly a millionth of the actual storage size: for each 1GB of data, finding free space requires scanning only about 6KB of gaps metadata. With this mechanism, the other operations can be executed in virtually constant time.&lt;/p&gt;
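&lt;p&gt;To see why the scan stays small, here is a back-of-the-envelope calculation. The block size comes from the article, but the region and entry sizes are made-up placeholders, so the resulting ratio is illustrative rather than Qdrant’s actual figure:&lt;/p&gt;

```rust
// Illustrative numbers only; Qdrant's real defaults may differ.
const BLOCK_SIZE: u64 = 128;         // bytes per block (from the article)
const BLOCKS_PER_REGION: u64 = 1024; // hypothetical region size
const GAP_ENTRY_SIZE: u64 = 4;       // hypothetical bytes per region entry

/// Bytes of gaps-layer metadata needed to cover `storage_bytes` of data.
fn gaps_metadata_bytes(storage_bytes: u64) -> u64 {
    let blocks = storage_bytes.div_ceil(BLOCK_SIZE);
    let regions = blocks.div_ceil(BLOCKS_PER_REGION);
    regions * GAP_ENTRY_SIZE
}
```

&lt;p&gt;With these placeholder parameters, 1GiB of data needs only 32KiB of gaps metadata; the exact ratio depends on the block, region, and entry sizes.&lt;/p&gt;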
&lt;h2&gt;
  
  
  Gridstore in Production: Maintaining Data Integrity
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fahj36tq3wy0484431sdx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fahj36tq3wy0484431sdx.png" alt="Image description" width="800" height="118"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Gridstore’s architecture introduces multiple interdependent structures that must remain in sync to ensure data integrity:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The Data Layer&lt;/strong&gt; holds the data and associates each key with its location in storage, including page ID, block offset, and the size of its value.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Mask Layer&lt;/strong&gt; keeps track of which blocks are occupied and which are free.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Gaps Layer&lt;/strong&gt; provides an indexed view of free blocks for efficient space allocation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every time a new value is inserted or an existing value is updated, all of these components need to be modified in a coordinated way.&lt;/p&gt;
&lt;h3&gt;
  
  
  When Things Break in Real Life
&lt;/h3&gt;

&lt;p&gt;Real-world systems don’t operate in a vacuum. Failures happen: software bugs cause unexpected crashes, memory exhaustion forces processes to terminate, disks fail to persist data reliably, and power losses can interrupt operations at any moment. &lt;br&gt;
&lt;em&gt;The critical question is: what happens if a failure occurs while updating these structures?&lt;/em&gt;&lt;br&gt;
If one component is updated but another isn’t, the entire system could become inconsistent. Worse, if an operation is only partially written to disk, it could lead to orphaned data, unusable space, or even data corruption.&lt;/p&gt;
&lt;h3&gt;
  
  
  Stability Through Idempotency: Recovering With WAL
&lt;/h3&gt;

&lt;p&gt;To guard against these risks, Qdrant relies on a &lt;a href="https://qdrant.tech/documentation/concepts/storage/"&gt;&lt;strong&gt;Write-Ahead Log (WAL)&lt;/strong&gt;&lt;/a&gt;. Before committing an operation, Qdrant ensures that it is at least recorded in the WAL. If a crash happens before all updates are flushed, the system can safely replay operations from the log. &lt;/p&gt;

&lt;p&gt;This recovery mechanism introduces another essential property: &lt;a href="https://en.wikipedia.org/wiki/Idempotence" rel="noopener noreferrer"&gt;&lt;strong&gt;idempotence&lt;/strong&gt;&lt;/a&gt;. &lt;br&gt;
The storage system must be designed so that reapplying the same operation after a failure leads to the same final state as if the operation had been applied just once.&lt;/p&gt;
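&lt;p&gt;A toy model of why idempotence makes WAL replay safe: if every log entry is an upsert, replaying entries that were already applied leaves the state unchanged:&lt;/p&gt;

```rust
use std::collections::HashMap;

/// Toy model of WAL replay, assuming each entry is an upsert of (key, value).
/// Upserts are idempotent: applying an entry twice leaves the same state as
/// applying it once, so replaying an already-applied suffix after a crash
/// is harmless.
fn replay(state: &mut HashMap<u32, String>, log: &[(u32, String)]) {
    for (key, value) in log {
        state.insert(*key, value.clone());
    }
}
```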
&lt;h3&gt;
  
  
  The Grand Solution: Lazy Updates
&lt;/h3&gt;

&lt;p&gt;To achieve this, &lt;strong&gt;Gridstore completes updates lazily&lt;/strong&gt;, prioritizing the most critical part of the write: the data itself. &lt;/p&gt;

&lt;p&gt;👉 Instead of immediately updating all metadata structures, it writes the new value first while keeping lightweight pending changes in a buffer. &lt;br&gt;
👉 The system only finalizes these updates when explicitly requested, ensuring that a crash never results in marking data as deleted before the update has been safely persisted. &lt;br&gt;
👉 In the worst-case scenario, Gridstore may need to write the same data twice, leading to a minor space overhead, but it will never corrupt the storage by overwriting valid data. &lt;/p&gt;
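&lt;p&gt;The write path can be sketched like this (hypothetical names, heavily simplified): the new value is persisted first, and freeing the old version’s blocks is deferred to an explicit flush. A crash before the flush wastes a little space, since the old blocks stay marked as used, but never overwrites still-valid data:&lt;/p&gt;

```rust
use std::collections::HashMap;

/// Hypothetical sketch of lazy metadata updates.
struct LazyStore {
    values: HashMap<u32, (usize, Vec<u8>)>, // key -> (block, bytes)
    next_block: usize,
    pending_free: Vec<usize>, // old blocks awaiting the flush
    free: Vec<usize>,         // blocks actually reusable
}

impl LazyStore {
    fn new() -> Self {
        Self {
            values: HashMap::new(),
            next_block: 0,
            pending_free: Vec::new(),
            free: Vec::new(),
        }
    }

    fn put(&mut self, key: u32, bytes: Vec<u8>) {
        // Write the new value to a fresh (or previously freed) block first.
        let block = match self.free.pop() {
            Some(b) => b,
            None => {
                let b = self.next_block;
                self.next_block += 1;
                b
            }
        };
        if let Some((old_block, _)) = self.values.insert(key, (block, bytes)) {
            // The old version is not freed yet, only scheduled.
            self.pending_free.push(old_block);
        }
    }

    /// Finalize metadata: only now do the old blocks become reusable.
    fn flush(&mut self) {
        self.free.append(&mut self.pending_free);
    }
}
```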
&lt;h2&gt;
  
  
  How We Tested the Final Product
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsy8rcpa29k45ja0d5u8n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsy8rcpa29k45ja0d5u8n.png" alt="Image description" width="800" height="110"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  First... Model Testing
&lt;/h3&gt;

&lt;p&gt;Gridstore can be tested efficiently using model testing, which compares its behavior to a simple in-memory hash map. Since Gridstore should function like a persisted hash map, this method quickly detects inconsistencies.&lt;/p&gt;

&lt;p&gt;The process is straightforward:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Initialize a Gridstore instance and an empty hash map.&lt;/li&gt;
&lt;li&gt;Run random operations (put, delete, update) on both.&lt;/li&gt;
&lt;li&gt;Verify that results match after each operation.&lt;/li&gt;
&lt;li&gt;Compare all keys and values to ensure consistency.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This approach provides high test coverage, exposing issues like incorrect persistence or faulty deletions. Running large-scale model tests ensures Gridstore remains reliable in real-world use.&lt;/p&gt;

&lt;p&gt;Here is a naive way to generate operations in Rust.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;enum&lt;/span&gt; &lt;span class="n"&gt;Operation&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;Put&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;PointOffset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Payload&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nf"&gt;Delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;PointOffset&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nf"&gt;Update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;PointOffset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Payload&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;impl&lt;/span&gt; &lt;span class="n"&gt;Operation&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;random&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rng&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="k"&gt;impl&lt;/span&gt; &lt;span class="n"&gt;Rng&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_point_offset&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;u32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;Self&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;point_offset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rng&lt;/span&gt;&lt;span class="nf"&gt;.random_range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;..=&lt;/span&gt;&lt;span class="n"&gt;max_point_offset&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;operation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rng&lt;/span&gt;&lt;span class="nf"&gt;.gen_range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;..&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;match&lt;/span&gt; &lt;span class="n"&gt;operation&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;size_factor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rng&lt;/span&gt;&lt;span class="nf"&gt;.random_range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;..&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;random_payload&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rng&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size_factor&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="nn"&gt;Operation&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;Put&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;point_offset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nn"&gt;Operation&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;Delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;point_offset&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;size_factor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rng&lt;/span&gt;&lt;span class="nf"&gt;.random_range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;..&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;random_payload&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rng&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size_factor&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="nn"&gt;Operation&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;Update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;point_offset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nd"&gt;unreachable!&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
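&lt;p&gt;These generated operations can then be replayed against both the storage under test and the reference model. A self-contained sketch of that loop, with a second hash map standing in for the Gridstore instance:&lt;/p&gt;

```rust
use std::collections::HashMap;

/// Simplified operation type matching the generator above.
enum Operation {
    Put(u32, String),
    Delete(u32),
    Update(u32, String),
}

fn apply(map: &mut HashMap<u32, String>, op: &Operation) {
    match op {
        Operation::Put(k, v) | Operation::Update(k, v) => {
            map.insert(*k, v.clone());
        }
        Operation::Delete(k) => {
            map.remove(k);
        }
    }
}

/// Run the same operations against both sides and check they agree
/// after every step. In the real test, `storage` would be the actual
/// Gridstore instance rather than a second hash map.
fn model_test(ops: &[Operation]) -> bool {
    let mut model = HashMap::new();   // the trusted in-memory model
    let mut storage = HashMap::new(); // stand-in for the storage under test
    for op in ops {
        apply(&mut model, op);
        apply(&mut storage, op);
        if model != storage {
            return false;
        }
    }
    model == storage
}
```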



&lt;p&gt;Model testing is a high-value way to catch bugs, especially when your system mimics a well-defined component like a hash map. If your component behaves the same as another one, using model testing brings a lot of value for a bit of effort.&lt;/p&gt;

&lt;p&gt;We could have tested against RocksDB, but simplicity matters more. A simple hash map lets us run massive test sequences quickly, exposing issues faster.&lt;/p&gt;

&lt;p&gt;For even sharper debugging, property-based testing adds automated test generation and shrinking: it pinpoints failures with minimized test cases, making bug hunting faster and more effective.&lt;/p&gt;

&lt;h3&gt;
  
  
  Crash Testing: Can Gridstore Handle the Pressure?
&lt;/h3&gt;

&lt;p&gt;Designing for crash resilience is one thing, and proving it works under stress is another. To push Qdrant’s data integrity to the limit, we built &lt;a href="https://github.com/qdrant/crasher" rel="noopener noreferrer"&gt;&lt;strong&gt;Crasher&lt;/strong&gt;&lt;/a&gt;, a test bench that brutally kills and restarts Qdrant while it handles a heavy update workload.&lt;/p&gt;

&lt;p&gt;Crasher runs a loop that continuously writes data, then randomly crashes Qdrant. On each restart, Qdrant replays its &lt;a href="https://qdrant.tech/documentation/concepts/storage/"&gt;&lt;strong&gt;Write-Ahead Log (WAL)&lt;/strong&gt;&lt;/a&gt;, and we verify that data integrity holds. Possible anomalies include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Missing data (points, vectors, or payloads)&lt;/li&gt;
&lt;li&gt;Corrupt payload values&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This aggressive yet simple approach has uncovered real-world issues when run for extended periods. While we also use chaos testing for distributed setups, Crasher excels at fast, repeatable failure testing in a local environment.&lt;/p&gt;

&lt;h2&gt;
  
  
  Testing Gridstore Performance: Benchmarks
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F83z3krx9wuz8xosjmctg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F83z3krx9wuz8xosjmctg.png" alt="Image description" width="800" height="111"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To measure the impact of our new storage engine, we used &lt;a href="https://github.com/jonhoo/bustle" rel="noopener noreferrer"&gt;&lt;strong&gt;Bustle, a key-value storage benchmarking framework&lt;/strong&gt;&lt;/a&gt;, to compare Gridstore against RocksDB. We tested three workloads:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Workload Type&lt;/th&gt;
&lt;th&gt;Operation Distribution&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Read-heavy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;95% reads&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Insert-heavy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;80% inserts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Update-heavy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;50% updates&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  The results speak for themselves:
&lt;/h4&gt;

&lt;p&gt;Average latency for all kinds of workloads is lower across the board, particularly for inserts. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F07ww9g05p05zru1jaqb5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F07ww9g05p05zru1jaqb5.png" alt="Image description" width="800" height="368"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This shows a clear boost in performance. As we can see, the investment in Gridstore is paying off.&lt;/p&gt;

&lt;h3&gt;
  
  
  End-to-End Benchmarking
&lt;/h3&gt;

&lt;p&gt;Now, let’s test the impact on a real Qdrant instance. So far, we’ve only integrated Gridstore for &lt;a href="https://qdrant.tech/documentation/concepts/payload/"&gt;&lt;strong&gt;payloads&lt;/strong&gt;&lt;/a&gt; and &lt;a href="https://qdrant.tech/documentation/concepts/vectors/#sparse-vectors"&gt;&lt;strong&gt;sparse vectors&lt;/strong&gt;&lt;/a&gt;, but even this partial switch should show noticeable improvements.&lt;/p&gt;

&lt;p&gt;For benchmarking, we used our in-house &lt;a href="https://github.com/qdrant/bfb" rel="noopener noreferrer"&gt;&lt;strong&gt;bfb tool&lt;/strong&gt;&lt;/a&gt; to generate a workload. Our configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;bfb&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;-n&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2000000&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;--max-id&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1000000&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;--sparse-vectors&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.02&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;--set-payload&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;--on-disk-payload&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;--dim&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;--sparse-dim&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;--bool-payloads&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;--keywords&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;--float-payloads&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;--int-payloads&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;100000&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;--text-payloads&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;--text-payload-length&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;--skip-field-indices&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;--jsonl-updates&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;./rps.jsonl&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This benchmark upserts 1 million points twice. Each point has: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A medium to large payload&lt;/li&gt;
&lt;li&gt;A tiny dense vector (dense vectors use a different storage type)&lt;/li&gt;
&lt;li&gt;A sparse vector&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Additional configuration:
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;The test we conducted updated payload data separately in another request.&lt;/li&gt;
&lt;li&gt;There were no payload indices, which ensured we measured pure ingestion speed.&lt;/li&gt;
&lt;li&gt;Finally, we gathered request latency metrics for analysis.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We ran this against Qdrant 1.12.6, toggling between the old and new storage backends. &lt;/p&gt;

&lt;h3&gt;
  
  
  Final Result
&lt;/h3&gt;

&lt;p&gt;Data ingestion is &lt;strong&gt;twice as fast and with a smoother throughput&lt;/strong&gt; — a massive win! 😍&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frxb9k7bspj5ne9895c7g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frxb9k7bspj5ne9895c7g.png" alt="Image description" width="800" height="363"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We optimized for speed, and it paid off—but what about storage size?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Gridstore: 2333MB&lt;/li&gt;
&lt;li&gt;RocksDB: 2319MB&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Strictly speaking, RocksDB is slightly smaller, but the difference is negligible compared to the 2x faster ingestion and more stable throughput. A small trade-off for a big performance gain! &lt;/p&gt;

&lt;h2&gt;
  
  
  Trying Out Gridstore
&lt;/h2&gt;

&lt;p&gt;Gridstore represents a significant advancement in how Qdrant manages its &lt;strong&gt;key-value storage&lt;/strong&gt; needs. It offers great performance and streamlined updates tailored specifically for our use case. We have managed to achieve faster, more reliable data ingestion while maintaining data integrity, even under heavy workloads and unexpected failures. It is already used as a storage backend for on-disk payloads and sparse vectors.&lt;/p&gt;

&lt;p&gt;👉 It’s important to note that Gridstore remains tightly integrated with Qdrant and, as such, has not been released as a standalone crate. &lt;br&gt;
Its API is still evolving, and we are focused on refining it within our ecosystem to ensure maximum stability and performance. That said, we recognize the value this innovation could bring to the wider Rust community. In the future, once the API stabilizes and we decouple it enough from Qdrant, we will consider publishing it as a contribution to the community ❤️.&lt;/p&gt;

&lt;p&gt;For now, Gridstore continues to drive improvements in Qdrant, demonstrating the benefits of a custom-tailored storage engine designed with modern demands in mind. Stay tuned for further updates and potential community releases as we keep pushing the boundaries of performance and reliability.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkwbbltk1kbymxiqexsku.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkwbbltk1kbymxiqexsku.png" alt="Image description" width="800" height="200"&gt;&lt;/a&gt;&lt;br&gt;
Simple, efficient, and designed just for Qdrant.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>devops</category>
      <category>opensource</category>
      <category>database</category>
    </item>
    <item>
      <title>Discovery needs context</title>
      <dc:creator>Luis Cossío</dc:creator>
      <pubDate>Thu, 01 Feb 2024 13:55:36 +0000</pubDate>
      <link>https://forem.com/qdrant/discovery-needs-context-45l</link>
      <guid>https://forem.com/qdrant/discovery-needs-context-45l</guid>
      <description>&lt;p&gt;When Christopher Columbus and his crew sailed to cross the Atlantic Ocean, they were not looking for America. They were looking for a new route to India, and they were convinced that the Earth was round. They didn’t know anything about America, but since they were going west, they stumbled upon it.&lt;/p&gt;

&lt;p&gt;They couldn’t reach their target because the geography didn’t let them, but once they realized it wasn’t India, they claimed it as a new “discovery” for their crown. If we consider that sailors need water to sail, then we can establish a context which is positive on the water and negative on land. Once the sailors’ search was stopped by the land, they could go no further, and a new route was found. Let’s keep these concepts of target and context in mind as we explore the new functionality of Qdrant: &lt;strong&gt;Discovery search.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In version 1.7, Qdrant &lt;a href="https://qdrant.tech/articles/qdrant-1.7.x/"&gt;released&lt;/a&gt; this novel API that lets you constrain the space in which a search is performed, relying only on pure vectors. This is a powerful tool that lets you explore the vector space in a more controlled way. It can be used to find points that are not necessarily closest to the target, but are still relevant to the search.&lt;/p&gt;

&lt;p&gt;You can already select which points are available to the search by using payload filters. This by itself is very versatile because it allows us to craft complex filters that deterministically show only the points satisfying their criteria. However, the payload associated with each point is arbitrary and cannot tell us anything about its position in the vector space. In other words, filtering out irrelevant points can be seen as creating a mask, rather than a hyperplane cutting between the positive and negative vectors in the space.&lt;/p&gt;

&lt;p&gt;This is where a &lt;strong&gt;vector&lt;/strong&gt; &lt;em&gt;context&lt;/em&gt; can help. We define context as a list of pairs. Each pair is made up of a positive and a negative vector. With a context, we can define hyperplanes within the vector space, which always prefer the positive over the negative vectors. This effectively partitions the space where the search is performed. After the space is partitioned, we then need a target to return the points that are more similar to it.&lt;/p&gt;
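&lt;p&gt;To make the hyperplane intuition concrete, here is a minimal Python sketch (purely illustrative, with hypothetical names like &lt;code&gt;side&lt;/code&gt;; this is not Qdrant's internal code): each positive-negative pair splits the space according to which example a point is more similar to, and several pairs together partition it into cells.&lt;/p&gt;

```python
def sim(a, b):
    # Dot product as a stand-in similarity (assumes comparable vectors).
    return sum(x * y for x, y in zip(a, b))

def side(point, positive, negative):
    # Which side of the pair's hyperplane the point falls on:
    # True when it is more similar to the positive example.
    return sim(point, positive) > sim(point, negative)

# Two pairs partition 2D space into up to four cells.
pairs = [((1.0, 0.0), (-1.0, 0.0)),   # prefer "east" over "west"
         ((0.0, 1.0), (0.0, -1.0))]   # prefer "north" over "south"

cell = tuple(side((0.7, -0.2), pos, neg) for pos, neg in pairs)
# The point is east (True) of the first hyperplane but south (False) of the second.
```

&lt;p&gt;With n pairs, a point can land in one of up to 2^n cells; the all-positive cell is the one the search prefers.&lt;/p&gt;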

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs03wz2e4vepw5ng9zy9d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs03wz2e4vepw5ng9zy9d.png" alt="" width="720" height="422"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;While positive and negative vectors might suggest the use of the &lt;a href="https://qdrant.tech/documentation/concepts/explore/#recommendation-api"&gt;recommendation interface&lt;/a&gt;, in the case of context they must be paired up in a positive-negative fashion. This is inspired by the machine-learning concept of &lt;em&gt;&lt;a href="https://en.wikipedia.org/wiki/Triplet_loss"&gt;triplet loss&lt;/a&gt;&lt;/em&gt;, where you have three vectors: an anchor, a positive, and a negative. Triplet loss evaluates how much closer the anchor is to the positive than to the negative vector, so that learning happens by “moving” the positive and negative points to get a better evaluation. During discovery, however, we consider the positive and negative vectors as static points, and we search the whole dataset for the “anchors”, or result candidates, which best fit this characteristic.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff68nwfa5m6wvfvb7pb5s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff68nwfa5m6wvfvb7pb5s.png" alt="" width="720" height="542"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://qdrant.tech/articles/discovery-search/#discovery-search"&gt;Discovery search&lt;/a&gt;, then, is made up of two main inputs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;target&lt;/strong&gt;: the main point of interest&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;context&lt;/strong&gt;: the pairs of positive and negative points we just defined&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;However, this is not the only way to use it. Alternatively, you can provide &lt;strong&gt;only&lt;/strong&gt; a context, which invokes a Context Search. This is useful when you want to explore the space defined by the context but don’t have a specific target in mind. But hold your horses, we’ll get to that later.&lt;/p&gt;

&lt;h3&gt;
  
  
  Discovery search
&lt;/h3&gt;

&lt;p&gt;Let’s talk about the first case: context with a target.&lt;/p&gt;

&lt;p&gt;To understand why this is useful, let’s take a look at a real-world example: using a multimodal encoder like CLIP to search for images, from text and images. &lt;a href="https://openai.com/blog/clip/"&gt;CLIP&lt;/a&gt; is a neural network that can embed both images &lt;strong&gt;and&lt;/strong&gt; text into the same vector space. This means that you can search for images using either a text query or an image query. For this example, we’ll reuse our &lt;a href="https://food-discovery.qdrant.tech/"&gt;food recommendations&lt;/a&gt; demo by typing “burger” in the text input:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fen3iop9l6tywisgdiol4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fen3iop9l6tywisgdiol4.png" alt="" width="720" height="539"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is basically nearest neighbor search, and while technically we have only images of burgers, one of them is a logo representation of a burger. We’re looking for actual burgers, though. Let’s try to exclude images like that by adding it as a negative example:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fntn6r1ccgmrgdcqeeeu5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fntn6r1ccgmrgdcqeeeu5.png" alt="" width="720" height="569"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Wait a second, what has just happened? These pictures have nothing to do with burgers, and still, they appear on the first results. Is the demo broken?&lt;/p&gt;

&lt;p&gt;Turns out, multimodal encoders &lt;a href="https://modalitygap.readthedocs.io/en/latest/"&gt;might not work how you expect them to&lt;/a&gt;. Images and text are embedded in the same space, but they are not necessarily close to each other. This means that we can create a mental model of the distribution as two separate planes, one for images and one for text.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn7c1bl5a647x2wz734l1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn7c1bl5a647x2wz734l1.png" alt="" width="720" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is where discovery excels, because it allows us to constrain the space considering the same mode (images) while using a target from the other mode (text).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgpokfyopr2q7wr3amym1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgpokfyopr2q7wr3amym1.png" alt="" width="720" height="377"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Discovery also lets us keep giving feedback to the search engine in the shape of more context pairs, so we can keep refining our search until we find what we are looking for.&lt;/p&gt;

&lt;p&gt;Another intuitive example: imagine you’re looking for a fish pizza, but pizza names can be confusing, so you can just type “pizza”, and prefer a fish over meat. Discovery search will let you use these inputs to suggest a fish pizza… even if it’s not called fish pizza!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fggtp32bkem5lzs97sbda.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fggtp32bkem5lzs97sbda.png" alt="" width="720" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Context search
&lt;/h3&gt;

&lt;p&gt;Now, second case: only providing context.&lt;/p&gt;

&lt;p&gt;Ever been caught in the same recommendations on your favourite music streaming service? This may be caused by getting stuck in a similarity bubble. As user input gets more complex, diversity becomes scarce, and it becomes harder to force the system to recommend something different.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsgregvggwx3xksy005xd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsgregvggwx3xksy005xd.png" alt="" width="720" height="423"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Context search&lt;/strong&gt; solves this by de-focusing the search around a single point. Instead, it selects points randomly from within a zone in the vector space. This search is the most influenced by triplet loss, as the score can be thought of as answering: “how much closer is this point to a negative than to a positive vector?”. If it is closer to the positive one, then its score will be zero, the same as any other point within the same zone. But if it is on the negative side, it will be assigned an increasingly negative score the further it gets.&lt;/p&gt;
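&lt;p&gt;The scoring just described can be sketched in Python as follows (an illustration of the behavior above with hypothetical names, not Qdrant's exact implementation): each pair contributes zero when the candidate is on its positive side, and a growing penalty the deeper it sits on the negative side.&lt;/p&gt;

```python
def sim(a, b):
    # Dot product as a stand-in similarity.
    return sum(x * y for x, y in zip(a, b))

def context_score(candidate, pairs):
    # Zero inside the "positive" zone; increasingly negative outside it.
    return sum(min(sim(candidate, pos) - sim(candidate, neg), 0.0)
               for pos, neg in pairs)

pairs = [((1.0, 0.0), (-1.0, 0.0))]
context_score((0.5, 0.8), pairs)   # 0.0: on the positive side
context_score((-0.5, 0.8), pairs)  # -1.0: penalized, and more so further out
```

&lt;p&gt;All points in the positive zone tie at zero, which is exactly what lets context search return a diverse selection from within that zone instead of crowding around one point.&lt;/p&gt;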

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9gfbk52ed1zmc5bc1wna.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9gfbk52ed1zmc5bc1wna.png" alt="" width="720" height="423"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Representing complex tastes in a high-dimensional space becomes easier, since you can just add more context pairs to the search. This way, you can constrain the space enough to select points from a per-search “category” created purely from the context in the input.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft8za3avfosqsiao463nl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft8za3avfosqsiao463nl.png" alt="" width="720" height="422"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This way you can serve refreshing recommendations, while still staying in control by providing positive and negative feedback, or even by trying out different permutations of pairs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Wrapping up
&lt;/h3&gt;

&lt;p&gt;Discovery search is a powerful tool that lets you explore the vector space in a more controlled way. It can be used to find points that are not necessarily close to the target, but are still relevant to the search. It can also be used to represent complex tastes, and break out of the similarity bubble. Check out the &lt;a href="https://qdrant.tech/documentation/concepts/explore/#discovery-api"&gt;documentation&lt;/a&gt; to learn more about the math behind it and how to use it.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>database</category>
      <category>machinelearning</category>
    </item>
  </channel>
</rss>
