<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Yefu Wang</title>
    <description>The latest articles on Forem by Yefu Wang (@yefuwang).</description>
    <link>https://forem.com/yefuwang</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1766430%2F74a6918b-7199-444b-8611-e5608480f7c5.png</url>
      <title>Forem: Yefu Wang</title>
      <link>https://forem.com/yefuwang</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/yefuwang"/>
    <language>en</language>
    <item>
      <title>Introducing vectorlite: A Fast and Tunable Vector Search Extension for SQLite</title>
      <dc:creator>Yefu Wang</dc:creator>
      <pubDate>Thu, 18 Jul 2024 15:47:53 +0000</pubDate>
      <link>https://forem.com/yefuwang/introducing-vectorlite-a-fast-and-tunable-vector-search-extension-for-sqlite-4dcl</link>
      <guid>https://forem.com/yefuwang/introducing-vectorlite-a-fast-and-tunable-vector-search-extension-for-sqlite-4dcl</guid>
      <description>&lt;h1&gt;
  
  
  Introduction
&lt;/h1&gt;

&lt;p&gt;With the rise of LLMs(Large Language Models) and RAG(Retrieval-Augmented Generation), vector databases, like &lt;a href="https://milvus.io/" rel="noopener noreferrer"&gt;Milvus&lt;/a&gt; and &lt;a href="https://www.pinecone.io/" rel="noopener noreferrer"&gt;Pinecone&lt;/a&gt;, are getting a lot of attention. Traditional databases are also catching up in vector search support via third-party extensions, such as &lt;a href="https://github.com/pgvector/pgvector" rel="noopener noreferrer"&gt;pgvector&lt;/a&gt; for PostgreSQL and &lt;a href="https://github.com/asg017/sqlite-vss" rel="noopener noreferrer"&gt;sqlite-vss&lt;/a&gt; for SQLite.&lt;/p&gt;

&lt;p&gt;In this article, I'm introducing &lt;a href="https://github.com/1yefuwang1/vectorlite" rel="noopener noreferrer"&gt;vectorlite&lt;/a&gt;, yet another fast and tunable vector search extension I write for SQLite that has just made its first beta release. It can now be installed using pip&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;vectorlite-py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A SQLite extension is a dynamic library that can be loaded by SQLite at runtime. Below is an example of using vectorlite in the SQLite CLI shell:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Load vectorlite&lt;/span&gt;
&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;load&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="k"&gt;to&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;vectorlite&lt;/span&gt;&lt;span class="p"&gt;.[&lt;/span&gt;&lt;span class="n"&gt;so&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;dll&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;dylib&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="c1"&gt;-- shows vectorlite version and build info.&lt;/span&gt;
&lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="n"&gt;vectorlite_info&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; 
&lt;span class="c1"&gt;-- Calculate vector l2(squared) distance&lt;/span&gt;
&lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="n"&gt;vector_distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector_from_json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'[1,2,3]'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;vector_from_json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'[3,4,5]'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;'l2'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;-- Create a virtual table named my_table with one vector column my_embedding with a dimention of 3 using default HNSW parameters.&lt;/span&gt;
&lt;span class="k"&gt;create&lt;/span&gt; &lt;span class="n"&gt;virtual&lt;/span&gt; &lt;span class="k"&gt;table&lt;/span&gt; &lt;span class="n"&gt;my_table&lt;/span&gt; &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;vectorlite&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;my_embedding&lt;/span&gt; &lt;span class="n"&gt;float32&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;hnsw&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_elements&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="c1"&gt;-- Insert vectors into my_table. rowid can be used to relate to a vector's metadata stored elsewhere, e.g. another table.&lt;/span&gt;
&lt;span class="k"&gt;insert&lt;/span&gt; &lt;span class="k"&gt;into&lt;/span&gt; &lt;span class="n"&gt;my_table&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rowid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;my_embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;values&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vector_from_json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'[1,2,3]'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="k"&gt;insert&lt;/span&gt; &lt;span class="k"&gt;into&lt;/span&gt; &lt;span class="n"&gt;my_table&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rowid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;my_embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;values&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vector_from_json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'[2,3,4]'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="k"&gt;insert&lt;/span&gt; &lt;span class="k"&gt;into&lt;/span&gt; &lt;span class="n"&gt;my_table&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rowid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;my_embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;values&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vector_from_json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'[7,7,7]'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="c1"&gt;-- Find 2 approximate nearest neighbors of vector [3,4,5] with distances&lt;/span&gt;
&lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="n"&gt;rowid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;my_table&lt;/span&gt; &lt;span class="k"&gt;where&lt;/span&gt; &lt;span class="n"&gt;knn_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;my_embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;knn_param&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector_from_json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'[3,4,5]'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="c1"&gt;-- Find the nearest neighbor of vector [3,4,5] among vectors with rowid 0 and 1. (requires sqlite_version&amp;gt;=3.38)&lt;/span&gt;
&lt;span class="c1"&gt;-- It is called metadata filter in vectorlite, because you could get rowid set beforehand based on vectors' metadata and then perform vector search.&lt;/span&gt;
&lt;span class="c1"&gt;-- Metadata filter is pushed down to the underlying index when traversing the HNSW graph.&lt;/span&gt;
&lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="n"&gt;rowid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;my_table&lt;/span&gt; &lt;span class="k"&gt;where&lt;/span&gt; &lt;span class="n"&gt;knn_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;my_embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;knn_param&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector_from_json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'[3,4,5]'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;and&lt;/span&gt; &lt;span class="n"&gt;rowid&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;;&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For full API reference, please check &lt;a href="https://github.com/1yefuwang1/vectorlite?tab=readme-ov-file#api-reference" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Highlights
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Fast ANN(Approximate Nearest Neighbors) search backed by &lt;a href="https://github.com/nmslib/hnswlib" rel="noopener noreferrer"&gt;hnswlib&lt;/a&gt;. Please see &lt;a href="https://github.com/1yefuwang1/vectorlite?tab=readme-ov-file#benchmark" rel="noopener noreferrer"&gt;benchmark&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Works on Windows, Linux and MacOS.&lt;/li&gt;
&lt;li&gt;SIMD accelerated vector distance calculation for x86 platform, using &lt;code&gt;vector_distance()&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Supports all vector distance types provided by hnswlib: l2(squared l2), cosine, ip(inner product. I do not recomend you to use it though). For more info please check &lt;a href="https://github.com/nmslib/hnswlib/tree/v0.8.0?tab=readme-ov-file#supported-distances" rel="noopener noreferrer"&gt;hnswlib's doc&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Full control over &lt;a href="https://github.com/nmslib/hnswlib/blob/v0.8.0/ALGO_PARAMS.md" rel="noopener noreferrer"&gt;HNSW parameters&lt;/a&gt; for performance tuning. Please check &lt;a href="https://github.com/1yefuwang1/vectorlite/blob/v0.1.0/examples/hnsw_param.py" rel="noopener noreferrer"&gt;this example&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Predicate pushdown support for vector metadata filter (requires sqlite version &amp;gt;= 3.38). Please check &lt;a href="https://github.com/1yefuwang1/vectorlite/blob/v0.1.0/examples/metadata_filter.py" rel="noopener noreferrer"&gt;this example&lt;/a&gt;;&lt;/li&gt;
&lt;li&gt;Index serde support. A vectorlite table can be saved to a file, and be reloaded from it. Index files created by hnswlib can also be loaded by vectorlite. Please check &lt;a href="https://github.com/1yefuwang1/vectorlite/blob/v0.1.0/examples/index_serde.py" rel="noopener noreferrer"&gt;this example&lt;/a&gt;;&lt;/li&gt;
&lt;li&gt;Vector json serde support using &lt;code&gt;vector_from_json()&lt;/code&gt; and &lt;code&gt;vector_to_json()&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;h1&gt;
  
  
  Why another vector search extension for SQLite?
&lt;/h1&gt;

&lt;p&gt;First of all, SQLite is valuable tool for LLM-powered apps because it is lightweight and stores data locally, making your data much safer.&lt;/p&gt;

&lt;p&gt;Though there's already sqlite-vss that enables vector search for SQLite, some of its technical decisions are worth debating, which is why I write vectorlite. The author of sqlite-vss has moved on to another vector search extension project and wrote an &lt;a href="https://alexgarcia.xyz/blog/2024/building-new-vector-search-sqlite/index.html" rel="noopener noreferrer"&gt;article&lt;/a&gt; explaining what's wrong with sqlite-vss from his point of view. I'll share mine.&lt;/p&gt;

&lt;h2&gt;
  
  
  The choice of vector search library
&lt;/h2&gt;

&lt;p&gt;Sqlite-vss uses &lt;a href="https://github.com/facebookresearch/faiss" rel="noopener noreferrer"&gt;faiss&lt;/a&gt; to do vector seaching. It is a great library opensourced by Meta(facebook) and provides a wide range of algorithms for vector search. However, it is optimized for batch operations over a large dataset, making it &lt;a href="https://github.com/facebookresearch/faiss/wiki/FAQ#why-are-searches-with-a-single-query-vector-slow" rel="noopener noreferrer"&gt;slow for a single vector query&lt;/a&gt; and &lt;a href="https://github.com/facebookresearch/faiss/issues/44" rel="noopener noreferrer"&gt;incremental indexing&lt;/a&gt; on CPU. However, SQLite's &lt;a href="https://www.sqlite.org/vtab.html" rel="noopener noreferrer"&gt;extensibility model&lt;/a&gt; (called virtual table) doesn't provide APIs for batch operations and only exposes API to insert/update/delete a single row at a time. Besides, sqlite-vss only support single-vector search, which faiss is not good at. As a result, sqlite-vss can't fully exploit faiss's performance.&lt;/p&gt;

&lt;p&gt;Vectorlite uses &lt;a href="https://github.com/nmslib/hnswlib" rel="noopener noreferrer"&gt;hnswlib&lt;/a&gt; that provides a fast implementation of HNSW algorithm and is optimized for incremental index construction and single-vector queries, which plays well with SQLite's virtual table API.&lt;/p&gt;

&lt;p&gt;In my &lt;a href="https://github.com/1yefuwang1/vectorlite?tab=readme-ov-file#benchmark" rel="noopener noreferrer"&gt;benchmark&lt;/a&gt;, compared with sqlite-vss, vectorlite is 10x faster in inserting vectors and 2x-40x faster in searching (depending on HNSW parameters with speed-accuracy tradeoff), which is mainly due to different choices of vector search library.&lt;/p&gt;

&lt;h2&gt;
  
  
  Vector metadata filter
&lt;/h2&gt;

&lt;p&gt;Another important feature that &lt;a href="https://github.com/asg017/sqlite-vss/issues/2" rel="noopener noreferrer"&gt;sqlite-vss misses&lt;/a&gt; is vector metadata filtering. In real word scenarios, a vector is always tagged with some metadata including dates, genres, categories, names, types, contents, etc. There's no doubt that filtering a vector based on its tags before performing vector search is a must-have feature for a functional vector database.&lt;/p&gt;

&lt;p&gt;Vectorlite supports vector metadata filter (requires sqlite version &amp;gt;= 3.38) since first release. Please check this &lt;a href="https://github.com/1yefuwang1/vectorlite/blob/v0.1.0/examples/metadata_filter.py" rel="noopener noreferrer"&gt;example&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Performance tuning for production deployment
&lt;/h2&gt;

&lt;p&gt;Both sqlite-vss and vectorlite leverage ANN(Approximate Nearest Neighbors) alogorithms for vector search, meaning that the result is not guaranteed to be 100% correct but the search is relatively fast. In prodcution, the performance of ANN alogrithms need to be benchmarked and tuned based on your workload, embeddings and specific needs. For example, one may need fast vector query at the cost of lower recall rate because they are building a realtime semantic search app, while another may need high recall rate but doesn't care search speed because they are doing some offline analysis.&lt;/p&gt;

&lt;p&gt;In a typical recommendation system, its vector index is often built and tuned offline using libraries like faiss and hnswlib for acceptable speed and accuracy, and then served in production.&lt;/p&gt;

&lt;p&gt;With sqlite-vss, you &lt;strong&gt;have to&lt;/strong&gt; build vector index using it, making performance tuning very difficult. One cannot build the vector index offline with faiss and then serve the index with sqlite-vss. Though you do get to pass faiss strings to it to tune the index, non-default indexes are so slow that they barely work (probably due to the nature of the algorithm or improper use of faiss I mentioned above). &lt;/p&gt;

&lt;p&gt;With vectorlite, you have full control over the HNSW paramters to tune the search speed and recall rate. Please check this &lt;a href="https://github.com/1yefuwang1/vectorlite/tree/v0.1.0/benchmark" rel="noopener noreferrer"&gt;example&lt;/a&gt; and the result.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┏━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━┳━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┓
┃ distance ┃ vector    ┃ ef           ┃    ┃ ef     ┃ insert_time ┃ search_time ┃ recall ┃
┃ type     ┃ dimension ┃ construction ┃ M  ┃ search ┃ per vector  ┃ per query   ┃ rate   ┃
┡━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━╇━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━┩
│ l2       │ 256       │ 200          │ 32 │ 10     │ 291.13 us   │ 35.70 us    │ 31.60% │
│ l2       │ 256       │ 200          │ 32 │ 50     │ 291.13 us   │ 99.50 us    │ 72.30% │
│ l2       │ 256       │ 200          │ 32 │ 100    │ 291.13 us   │ 168.80 us   │ 88.60% │
│ l2       │ 256       │ 200          │ 32 │ 150    │ 291.13 us   │ 310.53 us   │ 95.50% │
│ l2       │ 256       │ 200          │ 48 │ 10     │ 286.92 us   │ 37.79 us    │ 37.30% │
│ l2       │ 256       │ 200          │ 48 │ 50     │ 286.92 us   │ 117.73 us   │ 80.30% │
│ l2       │ 256       │ 200          │ 48 │ 100    │ 286.92 us   │ 196.01 us   │ 93.80% │
│ l2       │ 256       │ 200          │ 48 │ 150    │ 286.92 us   │ 259.88 us   │ 98.20% │
│ l2       │ 256       │ 200          │ 64 │ 10     │ 285.82 us   │ 50.26 us    │ 42.60% │
│ l2       │ 256       │ 200          │ 64 │ 50     │ 285.82 us   │ 138.83 us   │ 84.00% │
│ l2       │ 256       │ 200          │ 64 │ 100    │ 285.82 us   │ 253.18 us   │ 95.40% │
│ l2       │ 256       │ 200          │ 64 │ 150    │ 285.82 us   │ 316.45 us   │ 98.70% │
│ l2       │ 1024      │ 200          │ 32 │ 10     │ 1395.02 us  │ 158.75 us   │ 23.50% │
│ l2       │ 1024      │ 200          │ 32 │ 50     │ 1395.02 us  │ 564.27 us   │ 60.30% │
│ l2       │ 1024      │ 200          │ 32 │ 100    │ 1395.02 us  │ 919.26 us   │ 79.30% │
│ l2       │ 1024      │ 200          │ 32 │ 150    │ 1395.02 us  │ 1232.40 us  │ 88.20% │
│ l2       │ 1024      │ 200          │ 48 │ 10     │ 1489.91 us  │ 252.91 us   │ 28.50% │
│ l2       │ 1024      │ 200          │ 48 │ 50     │ 1489.91 us  │ 848.13 us   │ 69.40% │
│ l2       │ 1024      │ 200          │ 48 │ 100    │ 1489.91 us  │ 1294.02 us  │ 86.80% │
│ l2       │ 1024      │ 200          │ 48 │ 150    │ 1489.91 us  │ 1680.97 us  │ 94.20% │
│ l2       │ 1024      │ 200          │ 64 │ 10     │ 1412.03 us  │ 273.36 us   │ 33.30% │
│ l2       │ 1024      │ 200          │ 64 │ 50     │ 1412.03 us  │ 899.13 us   │ 75.50% │
│ l2       │ 1024      │ 200          │ 64 │ 100    │ 1412.03 us  │ 1419.61 us  │ 90.10% │
│ l2       │ 1024      │ 200          │ 64 │ 150    │ 1412.03 us  │ 1821.85 us  │ 96.00% │
│ cosine   │ 256       │ 200          │ 32 │ 10     │ 255.22 us   │ 28.66 us    │ 38.60% │
│ cosine   │ 256       │ 200          │ 32 │ 50     │ 255.22 us   │ 85.39 us    │ 75.90% │
│ cosine   │ 256       │ 200          │ 32 │ 100    │ 255.22 us   │ 137.31 us   │ 91.10% │
│ cosine   │ 256       │ 200          │ 32 │ 150    │ 255.22 us   │ 190.87 us   │ 95.30% │
│ cosine   │ 256       │ 200          │ 48 │ 10     │ 259.62 us   │ 57.31 us    │ 46.60% │
│ cosine   │ 256       │ 200          │ 48 │ 50     │ 259.62 us   │ 170.54 us   │ 84.80% │
│ cosine   │ 256       │ 200          │ 48 │ 100    │ 259.62 us   │ 221.11 us   │ 94.80% │
│ cosine   │ 256       │ 200          │ 48 │ 150    │ 259.62 us   │ 239.90 us   │ 97.90% │
│ cosine   │ 256       │ 200          │ 64 │ 10     │ 273.21 us   │ 49.34 us    │ 48.10% │
│ cosine   │ 256       │ 200          │ 64 │ 50     │ 273.21 us   │ 139.07 us   │ 88.00% │
│ cosine   │ 256       │ 200          │ 64 │ 100    │ 273.21 us   │ 242.51 us   │ 96.30% │
│ cosine   │ 256       │ 200          │ 64 │ 150    │ 273.21 us   │ 296.21 us   │ 98.40% │
│ cosine   │ 1024      │ 200          │ 32 │ 10     │ 1192.27 us  │ 146.86 us   │ 27.40% │
│ cosine   │ 1024      │ 200          │ 32 │ 50     │ 1192.27 us  │ 451.61 us   │ 66.10% │
│ cosine   │ 1024      │ 200          │ 32 │ 100    │ 1192.27 us  │ 826.40 us   │ 83.30% │
│ cosine   │ 1024      │ 200          │ 32 │ 150    │ 1192.27 us  │ 1199.33 us  │ 90.00% │
│ cosine   │ 1024      │ 200          │ 48 │ 10     │ 1337.96 us  │ 200.14 us   │ 33.10% │
│ cosine   │ 1024      │ 200          │ 48 │ 50     │ 1337.96 us  │ 654.35 us   │ 72.60% │
│ cosine   │ 1024      │ 200          │ 48 │ 100    │ 1337.96 us  │ 1091.57 us  │ 88.90% │
│ cosine   │ 1024      │ 200          │ 48 │ 150    │ 1337.96 us  │ 1429.51 us  │ 94.50% │
│ cosine   │ 1024      │ 200          │ 64 │ 10     │ 1287.88 us  │ 257.67 us   │ 38.20% │
│ cosine   │ 1024      │ 200          │ 64 │ 50     │ 1287.88 us  │ 767.61 us   │ 77.00% │
│ cosine   │ 1024      │ 200          │ 64 │ 100    │ 1287.88 us  │ 1250.36 us  │ 92.10% │
│ cosine   │ 1024      │ 200          │ 64 │ 150    │ 1287.88 us  │ 1699.57 us  │ 96.50% │
└──────────┴───────────┴──────────────┴────┴────────┴─────────────┴─────────────┴────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The same benchmark is also run for &lt;a href="https://github.com/asg017/sqlite-vss" rel="noopener noreferrer"&gt;sqlite-vss&lt;/a&gt; using its default index:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┏━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ vector dimension ┃ insert_time(per vector) ┃ search_time(per query) ┃ recall_rate ┃
┡━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ 256              │ 3644.42 us              │ 1483.18 us             │ 100.00%      │
│ 1024             │ 18466.91 us             │ 3412.92 us             │ 100.20%      │
└──────────────────┴─────────────────────────┴────────────────────────┴─────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Vectorlite is faster while offering 96% recall rate if properly tuned. Besides, you can also build the index using hnswlib directly and serve the index using vectorlite as vectorlite can be considered a thin wrapper around hnswlib with a SQL API.&lt;/p&gt;

&lt;h1&gt;
  
  
  Quick start
&lt;/h1&gt;

&lt;p&gt;The quickest way to get started is to install vectorlite using python.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Note: vectorlite-py not vectorlite. vectorlite is another project.&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;vectorlite-py apsw numpy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Vectorlite's metadata filter feature requires sqlite&amp;gt;=3.38. Python's builtin &lt;code&gt;sqlite&lt;/code&gt; module is usually built with old sqlite versions. So &lt;code&gt;apsw&lt;/code&gt; is used here as the SQLite driver, as it provides bindings to newer SQLite. Vectorlite still works with old SQLite versions if metadata filter support is not required.&lt;br&gt;
Below is a minimal example of using vectorlite.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;vectorlite_py&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;apsw&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;
&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
Quick start of using vectorlite extension.
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

&lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;apsw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Connection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;:memory:&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;enable_load_extension&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# enable extension loading
&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load_extension&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vectorlite_py&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;vectorlite_path&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="c1"&gt;# load vectorlite
&lt;/span&gt;
&lt;span class="n"&gt;cursor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="c1"&gt;# check if vectorlite is loaded
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;select vectorlite_info()&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;fetchall&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

&lt;span class="c1"&gt;# Vector distance calculation
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;distance_type&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;l2&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;cosine&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ip&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;v1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[1, 2, 3]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;v2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[4, 5, 6]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="c1"&gt;# Note vector_from_json can be used to convert a JSON string to a vector
&lt;/span&gt;    &lt;span class="n"&gt;distance&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;select vector_distance(vector_from_json(?), vector_from_json(?), &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;distance_type&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v2&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;fetchone&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;distance_type&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; distance between &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;v1&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; and &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;v2&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; is &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;distance&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# generate some test data
&lt;/span&gt;&lt;span class="n"&gt;DIM&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt; &lt;span class="c1"&gt;# dimension of the vectors
&lt;/span&gt;&lt;span class="n"&gt;NUM_ELEMENTS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt; &lt;span class="c1"&gt;# number of vectors
&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;float32&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;random&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;NUM_ELEMENTS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;DIM&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt; &lt;span class="c1"&gt;# Only float32 vectors are supported by vectorlite for now
&lt;/span&gt;
&lt;span class="c1"&gt;# Create a virtual table using vectorlite using l2 distance (default distance type) and default HNSW parameters
&lt;/span&gt;&lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;create virtual table my_table using vectorlite(my_embedding float32[&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;DIM&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;], hnsw(max_elements=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;NUM_ELEMENTS&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;))&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Vector distance type can be explicitly set to cosine using:
# cursor.execute(f'create virtual table my_table using vectorlite(my_embedding float32[{DIM}] cosine, hnsw(max_elements={NUM_ELEMENTS}))')
&lt;/span&gt;
&lt;span class="c1"&gt;# Insert the test data into the virtual table. Note that the rowid MUST be explicitly set when inserting vectors and cannot be auto-generated.
# The rowid is used to uniquely identify a vector and serve as a "foreign key" to relate to the vector's metadata.
# Vectorlite takes vectors in raw bytes, so a numpy vector need to be converted to bytes before inserting into the table.
&lt;/span&gt;&lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;executemany&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;insert into my_table(rowid, my_embedding) values (?, ?)&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;tobytes&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NUM_ELEMENTS&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;

&lt;span class="c1"&gt;# Query the virtual table to get the vector at rowid 12345. Note the vector needs to be converted back to json using vector_to_json() to be human-readable. 
&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;select vector_to_json(my_embedding) from my_table where rowid = 1234&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;fetchone&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;vector at rowid 1234: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Find 10 approximate nearest neighbors of data[0] and there distances from data[0].
# knn_search() is used to tell vectorlite to do a vector search.
# knn_param(V, K, ef) is used to pass the query vector V, the number of nearest neighbors K to find and an optional ef parameter to tune the performance of the search.
# If ef is not specified, ef defaults to 10. For more info on ef, please check https://github.com/nmslib/hnswlib/blob/v0.8.0/ALGO_PARAMS.md
&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;select rowid, distance from my_table where knn_search(my_embedding, knn_param(?, 10))&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;tobytes&lt;/span&gt;&lt;span class="p"&gt;()]).&lt;/span&gt;&lt;span class="nf"&gt;fetchall&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;10 nearest neighbors of row 0 is &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Find 10 approximate nearest neighbors of the first embedding in vectors with rowid within [1000, 2000) using metadata(rowid) filtering.
&lt;/span&gt;&lt;span class="n"&gt;rowids&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rowid&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;rowid&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2000&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;select rowid, distance from my_table where knn_search(my_embedding, knn_param(?, 10)) and rowid in (&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;rowids&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;tobytes&lt;/span&gt;&lt;span class="p"&gt;()]).&lt;/span&gt;&lt;span class="nf"&gt;fetchall&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;10 nearest neighbors of row 0 in vectors with rowid within [1000, 2000) is &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;More examples can be found in &lt;a href="https://github.com/1yefuwang1/vectorlite/tree/v0.1.0/examples" rel="noopener noreferrer"&gt;examples&lt;/a&gt; and &lt;a href="https://github.com/1yefuwang1/vectorlite/tree/v0.1.0/integration_test" rel="noopener noreferrer"&gt;integration_test&lt;/a&gt; folder.&lt;/p&gt;

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;Vectorlite is created with the hope that it could be the go-to vector search solution for SQLite like pgvector for PostgreSQL. &lt;br&gt;
It is still in early stage. Any suggestions are welcome and appreciated.&lt;/p&gt;

</description>
      <category>vectordatabase</category>
      <category>sqlite</category>
      <category>rag</category>
      <category>python</category>
    </item>
  </channel>
</rss>
