<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Qdrant Articles on Qdrant - Vector Database</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/</link><description>Recent content in Qdrant Articles on Qdrant - Vector Database</description><generator>Hugo</generator><language>en-us</language><managingEditor>info@qdrant.tech (Andrey Vasnetsov)</managingEditor><webMaster>info@qdrant.tech (Andrey Vasnetsov)</webMaster><lastBuildDate>Wed, 28 Jan 2026 00:00:00 -0800</lastBuildDate><atom:link href="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/index.xml" rel="self" type="application/rss+xml"/><item><title>Distance-based data exploration</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/distance-based-exploration/</link><pubDate>Tue, 11 Mar 2025 12:00:00 +0300</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/distance-based-exploration/</guid><description>&lt;h2 id="hidden-structure">Hidden Structure&lt;/h2>
&lt;p>When working with large collections of documents, images, or other arrays of unstructured data, it often becomes useful to understand the big picture.
Examining data points individually is not always the best way to grasp the structure of the data.&lt;/p>
&lt;figure>&lt;img src="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles_data/distance-based-exploration/no-context-data.png"
 alt="Data visualization">&lt;figcaption>
 &lt;p>Data points without context are pretty much useless&lt;/p>
 &lt;/figcaption>
&lt;/figure>

&lt;p>Just as numbers in a table gain meaning when plotted on a graph, visualizing distances (similar vs. dissimilar) between unstructured data items can reveal hidden structures and patterns.&lt;/p></description></item><item><title>Modern Sparse Neural Retrieval: From Theory to Practice</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/modern-sparse-neural-retrieval/</link><pubDate>Wed, 23 Oct 2024 00:00:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/modern-sparse-neural-retrieval/</guid><description>&lt;p>Finding enough time to study all the modern solutions while keeping your production running is rarely feasible.
Dense retrievers, hybrid retrievers, late interaction… How do they work, and where do they fit best?
If only we could compare retrievers as easily as products on Amazon!&lt;/p>
&lt;p>We explored the most popular modern sparse neural retrieval models and broke them down for you.
By the end of this article, you’ll have a clear understanding of the current landscape in sparse neural retrieval and how to navigate through complex, math-heavy research papers with sky-high NDCG scores without getting overwhelmed.&lt;/p></description></item><item><title>Qdrant Summer of Code 2024 - ONNX Cross Encoders in Python</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/cross-encoder-integration-gsoc/</link><pubDate>Mon, 14 Oct 2024 08:00:00 +0300</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/cross-encoder-integration-gsoc/</guid><description>&lt;h2 id="introduction">Introduction&lt;/h2>
&lt;p>Hi everyone! I’m Huong (Celine) Hoang, and I’m thrilled to share my experience working at Qdrant this summer as part of their Summer of Code 2024 program. During my internship, I worked on integrating cross-encoders into the FastEmbed library for re-ranking tasks. This enhancement widened the capabilities of the Qdrant ecosystem, enabling developers to build more context-aware search applications, such as question-answering systems, using Qdrant&amp;rsquo;s suite of libraries.&lt;/p>
&lt;p>This project was both technically challenging and rewarding, pushing me to grow my skills in handling large-scale ONNX (Open Neural Network Exchange) model integrations, tokenization, and more. Let me take you through the journey, the lessons learned, and where things are headed next.&lt;/p></description></item><item><title>What is a Vector Database?</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/what-is-a-vector-database/</link><pubDate>Wed, 09 Oct 2024 09:29:33 -0300</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/what-is-a-vector-database/</guid><description>&lt;h2 id="an-introduction-to-vector-databases">An Introduction to Vector Databases&lt;/h2>
&lt;p>&lt;img src="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles_data/what-is-a-vector-database/vector-database-1.jpeg" alt="vector-database-architecture">&lt;/p>
&lt;p>Most of the millions of terabytes of data we generate each day is &lt;strong>unstructured&lt;/strong>. Think of the meal photos you snap, the PDFs shared at work, or the podcasts you save but may never listen to. None of it fits neatly into rows and columns.&lt;/p>
&lt;p>Unstructured data lacks a strict format or schema, making it challenging for conventional databases to manage. Yet, this unstructured data holds immense potential for &lt;strong>AI&lt;/strong>, &lt;strong>machine learning&lt;/strong>, and &lt;strong>modern search engines&lt;/strong>.&lt;/p></description></item><item><title>What is Vector Quantization?</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/what-is-vector-quantization/</link><pubDate>Wed, 25 Sep 2024 09:29:33 -0300</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/what-is-vector-quantization/</guid><description>&lt;p>Vector quantization is a data compression technique used to reduce the size of high-dimensional data. Compressing vectors reduces memory usage while maintaining nearly all of the essential information. This method allows for more efficient storage and faster search operations, particularly in large datasets.&lt;/p>
&lt;p>When working with high-dimensional vectors, such as embeddings from providers like OpenAI, a single 1536-dimensional vector requires &lt;strong>6 KB of memory&lt;/strong>.&lt;/p>
&lt;img src="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles_data/what-is-vector-quantization/vector-size.png" alt="1536-dimensional vector size is 6 KB" width="700">
&lt;p>With 1 million vectors needing around 6 GB of memory, as your dataset grows to multiple &lt;strong>millions of vectors&lt;/strong>, the memory and processing demands increase significantly.&lt;/p></description></item><item><title>Two Approaches to Helping AI Agents Use Your API (And Why You Need Both)</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/skill-md-meets-repl/</link><pubDate>Wed, 28 Jan 2026 00:00:00 -0800</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/skill-md-meets-repl/</guid><description>&lt;p>AI coding agents fail in predictable ways when working with APIs. Two recent approaches from Mintlify and Armin Ronacher attack different failure modes. Understanding both reveals something useful about how agents should interact with developer tools.&lt;/p>
&lt;h2 id="two-failure-modes">Two Failure Modes&lt;/h2>
&lt;p>When an agent writes code against your API, it can fail because:&lt;/p>
&lt;ol>
&lt;li>
&lt;p>&lt;strong>It doesn&amp;rsquo;t know what it doesn&amp;rsquo;t know.&lt;/strong> The agent uses a deprecated method, misconfigures a parameter, or violates a constraint that isn&amp;rsquo;t obvious from type signatures. This is the &amp;ldquo;unknown unknowns&amp;rdquo; problem: things the API maintainer knows but the agent doesn&amp;rsquo;t.&lt;/p></description></item><item><title>Vector Search Resource Optimization Guide</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/vector-search-resource-optimization/</link><pubDate>Sun, 09 Feb 2025 00:00:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/vector-search-resource-optimization/</guid><description>&lt;h2 id="whats-in-this-guide">What’s in This Guide?&lt;/h2>
&lt;p>&lt;a href="#storage-disk-vs-ram">&lt;strong>Resource Management Strategies:&lt;/strong>&lt;/a> If you are trying to scale your app on a budget - this is the guide for you. We will show you how to avoid wasting compute resources and get the maximum return on your investment.&lt;/p>
&lt;p>&lt;a href="#configure-indexing-for-faster-searches">&lt;strong>Performance Improvement Tricks:&lt;/strong>&lt;/a> We’ll dive into advanced techniques like indexing, compression, and partitioning. Our tips will help you get better results at scale, while reducing total resource expenditure.&lt;/p></description></item><item><title>A Complete Guide to Filtering in Vector Search</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/vector-search-filtering/</link><pubDate>Tue, 10 Sep 2024 00:00:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/vector-search-filtering/</guid><description>&lt;p>Imagine you sell computer hardware. To help shoppers easily find products on your website, you need to have a &lt;strong>user-friendly &lt;a href="https://qdrant.tech" target="_blank" rel="noopener nofollow">search engine&lt;/a>&lt;/strong>.&lt;/p>
&lt;p>&lt;img src="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles_data/vector-search-filtering/vector-search-ecommerce.png" alt="vector-search-ecommerce">&lt;/p>
&lt;p>If you’re selling computers and have extensive data on laptops, desktops, and accessories, your search feature should guide customers to the exact device they want - or at least a &lt;strong>very similar&lt;/strong> match.&lt;/p>
&lt;p>When storing data in Qdrant, each product is a point, consisting of an &lt;code>id&lt;/code>, a &lt;code>vector&lt;/code> and &lt;code>payload&lt;/code>:&lt;/p></description></item><item><title>Qdrant Internals: Immutable Data Structures</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/immutable-data-structures/</link><pubDate>Tue, 20 Aug 2024 10:45:00 +0200</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/immutable-data-structures/</guid><description>&lt;h2 id="data-structures-101">Data Structures 101&lt;/h2>
&lt;p>Those who took programming courses might remember that there is no such thing as a universal data structure.
Some structures are good at accessing elements by index (like arrays), while others shine in terms of insertion efficiency (like linked lists).&lt;/p>
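&lt;p>A tiny illustration of this classic trade-off (a generic Python sketch, unrelated to Qdrant&amp;rsquo;s internals):&lt;/p>

```python
from collections import deque

# Array-like list: O(1) access by index, but inserting at the
# front shifts every element, costing O(n).
arr = [1, 2, 3, 4]
assert arr[2] == 3       # constant-time random access
arr.insert(0, 0)         # linear-time front insertion
assert arr == [0, 1, 2, 3, 4]

# Deque (linked blocks): O(1) insertion at either end,
# but indexing into the middle is no longer constant time.
lst = deque([1, 2, 3, 4])
lst.appendleft(0)        # constant-time front insertion
assert list(lst) == [0, 1, 2, 3, 4]
```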
&lt;figure>&lt;img src="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles_data/immutable-data-structures/hardware-optimized.png"
 alt="Hardware-optimized data structure" width="80%">&lt;figcaption>
 &lt;p>Hardware-optimized data structure&lt;/p>
 &lt;/figcaption>
&lt;/figure>

&lt;p>However, when we move from theoretical data structures to real-world systems, and particularly in performance-critical areas such as &lt;a href="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/use-cases/">vector search&lt;/a>, things become more complex. &lt;a href="https://en.wikipedia.org/wiki/Big_O_notation" target="_blank" rel="noopener nofollow">Big-O notation&lt;/a> provides a good abstraction, but it doesn’t account for the realities of modern hardware: cache misses, memory layout, disk I/O, and other low-level considerations that influence actual performance.&lt;/p></description></item><item><title>miniCOIL: on the Road to Usable Sparse Neural Retrieval</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/minicoil/</link><pubDate>Tue, 13 May 2025 00:00:00 +0300</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/minicoil/</guid><description>&lt;p>Have you ever heard of sparse neural retrieval? If so, have you used it in production?&lt;/p>
&lt;p>It&amp;rsquo;s a field with excellent potential &amp;ndash; who wouldn&amp;rsquo;t want to use an approach that combines the strengths of dense and term-based text retrieval? Yet it&amp;rsquo;s not so popular. Is it due to the common curse of &lt;em>“What looks good on paper is not going to work in practice”&lt;/em>?&lt;/p>
&lt;p>This article describes our path towards sparse neural retrieval &lt;em>as it should be&lt;/em> &amp;ndash; lightweight term-based retrievers capable of distinguishing word meanings.&lt;/p></description></item><item><title>Relevance Feedback in Information Retrieval</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/search-feedback-loop/</link><pubDate>Thu, 27 Mar 2025 00:00:00 +0300</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/search-feedback-loop/</guid><description>&lt;blockquote>
&lt;p>A problem well stated is a problem half solved.&lt;/p>
&lt;/blockquote>
&lt;p>This quote applies as much to life as it does to information retrieval.&lt;/p>
&lt;p>With a well-formulated query, retrieving the relevant document becomes trivial.
In reality, however, most users struggle to precisely define what they are searching for.&lt;/p>
&lt;p>While users may struggle to formulate a perfect request — especially in unfamiliar topics — they can easily judge whether a retrieved answer is relevant or not.&lt;/p></description></item><item><title>Built for Vector Search</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/dedicated-vector-search/</link><pubDate>Mon, 17 Feb 2025 10:00:00 +0300</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/dedicated-vector-search/</guid><description>&lt;p>Any problem with even a bit of complexity requires a specialized solution. You can use a Swiss Army knife to open a bottle or poke a hole in a cardboard box, but you will need an axe to chop wood — the same goes for software.&lt;/p>
&lt;p>In this article, we will describe the unique challenges vector search poses and why a dedicated solution is the best way to tackle them.&lt;/p></description></item><item><title>Any* Embedding Model Can Become a Late Interaction Model... If You Give It a Chance!</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/late-interaction-models/</link><pubDate>Wed, 14 Aug 2024 00:00:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/late-interaction-models/</guid><description>&lt;p>* At least any open-source model, since you need access to its internals.&lt;/p>
&lt;h2 id="you-can-adapt-dense-embedding-models-for-late-interaction">You Can Adapt Dense Embedding Models for Late Interaction&lt;/h2>
&lt;p>Qdrant 1.10 introduced support for multi-vector representations, with late interaction being a prominent example of this model. In essence, both documents and queries are represented by multiple vectors, and identifying the most relevant documents involves calculating a score based on the similarity between the corresponding query and document embeddings. If you&amp;rsquo;re not familiar with this paradigm, our updated &lt;a href="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/hybrid-search/">Hybrid Search&lt;/a> article explains how multi-vector representations can enhance retrieval quality.&lt;/p></description></item><item><title>Optimizing Memory for Bulk Uploads</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/indexing-optimization/</link><pubDate>Thu, 13 Feb 2025 00:00:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/indexing-optimization/</guid><description>&lt;h1 id="optimizing-memory-consumption-during-bulk-uploads">Optimizing Memory Consumption During Bulk Uploads&lt;/h1>
&lt;p>Efficient memory management is a constant challenge when you’re dealing with &lt;strong>large-scale vector data&lt;/strong>. In high-volume ingestion scenarios, even seemingly minor configuration choices can significantly impact stability and performance.&lt;/p>
&lt;p>Let’s take a look at the best practices and recommendations to help you optimize memory usage during bulk uploads in Qdrant. We&amp;rsquo;ll cover scenarios with both &lt;strong>dense&lt;/strong> and &lt;strong>sparse&lt;/strong> vectors, helping your deployments remain performant even under high load and avoiding out-of-memory errors.&lt;/p></description></item><item><title>Introducing Gridstore: Qdrant's Custom Key-Value Store</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/gridstore-key-value-storage/</link><pubDate>Wed, 05 Feb 2025 00:00:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/gridstore-key-value-storage/</guid><description>&lt;h2 id="why-we-built-our-own-storage-engine">Why We Built Our Own Storage Engine&lt;/h2>
&lt;p>Databases need a place to store and retrieve data. That’s what Qdrant&amp;rsquo;s &lt;a href="https://en.wikipedia.org/wiki/Key%e2%80%93value_database" target="_blank" rel="noopener nofollow">&lt;strong>key-value storage&lt;/strong>&lt;/a> does—it links keys to values.&lt;/p>
&lt;p>When we started building Qdrant, we needed to pick something ready for the task. So we chose &lt;a href="https://rocksdb.org" target="_blank" rel="noopener nofollow">&lt;strong>RocksDB&lt;/strong>&lt;/a> as our embedded key-value store.&lt;/p>
&lt;div style="text-align: center;">
 &lt;img src="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles_data/gridstore-key-value-storage/rocksdb.jpg" alt="RocksDB" style="width: 50%;">
 &lt;p>It is mature, reliable, and well-documented.&lt;/p>
&lt;/div>
&lt;p>Over time, we ran into issues. Its architecture required compaction (it is built on an &lt;a href="https://en.wikipedia.org/wiki/Log-structured_merge-tree" target="_blank" rel="noopener nofollow">LSM tree&lt;/a>), which caused random latency spikes. It handles generic keys, while we only use it for sequential IDs. Having lots of configuration options makes it versatile, but tuning it accurately was a headache. Finally, interoperating with C++ slowed us down (although we will still support it for quite some time 😭).&lt;/p></description></item><item><title>What is Agentic RAG? Building Agents with Qdrant</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/agentic-rag/</link><pubDate>Fri, 22 Nov 2024 00:00:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/agentic-rag/</guid><description>&lt;p>Standard &lt;a href="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/what-is-rag-in-ai/">Retrieval Augmented Generation&lt;/a> follows a predictable, linear path: receive
a query, retrieve relevant documents, and generate a response. In many cases that might be enough to solve a particular
problem. In the worst-case scenario, your LLM will simply decline to answer the question because the context does not
provide enough information.&lt;/p>
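&lt;p>The linear flow can be summed up in a few lines (a hypothetical sketch; &lt;code>retrieve&lt;/code> and &lt;code>generate&lt;/code> stand in for your retriever and LLM call):&lt;/p>

```python
# Standard linear RAG: one retrieval step, one generation step, no loops.
def rag_answer(query, retrieve, generate):
    docs = retrieve(query)           # fetch relevant documents once
    return generate(query, docs)     # generate an answer from that context

# Toy stand-ins for a real retriever and LLM.
answer = rag_answer(
    "What is a vector database?",
    retrieve=lambda q: ["A vector database stores embeddings."],
    generate=lambda q, docs: f"Answer based on {len(docs)} document(s).",
)
print(answer)  # Answer based on 1 document(s).
```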
&lt;p>&lt;img src="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles_data/agentic-rag/linear-rag.png" alt="Standard, linear RAG pipeline">&lt;/p>
&lt;p>On the other hand, we have agents. These systems are given more freedom to act, and can take multiple non-linear steps
to achieve a certain goal. There isn&amp;rsquo;t a single definition of what an agent is, but in general, it is an application
that uses an LLM and, usually, some tools to communicate with the outside world. The LLM acts as a decision-maker that
chooses the next action. Actions can be anything, but they are usually well-defined and limited to a certain
set of possibilities. One of these actions might be to query a vector database, like Qdrant, to retrieve relevant
documents, if the context is not enough to make a decision. However, RAG is just a single tool in the agent&amp;rsquo;s arsenal.&lt;/p></description></item><item><title>Hybrid Search Revamped - Building with Qdrant's Query API</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/hybrid-search/</link><pubDate>Thu, 25 Jul 2024 00:00:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/hybrid-search/</guid><description>&lt;p>It&amp;rsquo;s been over a year since we published the original article on how to build a hybrid
search system with Qdrant. The idea was straightforward: combine the results from different search methods to improve
retrieval quality. Back in 2023, you still needed to use an additional service to bring lexical search
capabilities and combine all the intermediate results. Things have changed since then. Once we introduced support for
sparse vectors, &lt;a href="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/sparse-vectors/">the additional search service became obsolete&lt;/a>, but you were still
required to combine the results from different methods on your end.&lt;/p></description></item><item><title>What is RAG: Understanding Retrieval-Augmented Generation</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/what-is-rag-in-ai/</link><pubDate>Tue, 19 Mar 2024 09:29:33 -0300</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/what-is-rag-in-ai/</guid><description>&lt;blockquote>
&lt;p>Retrieval-augmented generation (RAG) integrates external information retrieval into the process of generating responses by Large Language Models (LLMs). It searches a database for information beyond its pre-trained knowledge base, significantly improving the accuracy and relevance of the generated responses.&lt;/p>
&lt;/blockquote>
&lt;p>Language models have exploded on the internet ever since ChatGPT came out, and rightfully so. They can write essays, code entire programs, and even make memes (though we’re still deciding on whether that&amp;rsquo;s a good thing).&lt;/p></description></item><item><title>BM42: New Baseline for Hybrid Search</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/bm42/</link><pubDate>Mon, 01 Jul 2024 12:00:00 +0300</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/bm42/</guid><description>&lt;aside role="status">
Please note that the benchmark section of this article was updated after the publication due to a mistake in the evaluation script.
BM42 does not outperform the BM25 implementations of other vendors.
Please consider BM42 as an experimental approach, which requires further research and development before it can be used in production.
&lt;/aside>
&lt;p>For the last 40 years, BM25 has served as the standard for search engines.
It is a simple yet powerful algorithm that has been used by many search engines, including Google, Bing, and Yahoo.&lt;/p></description></item><item><title>Qdrant 1.8.0: Enhanced Search Capabilities for Better Results</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/qdrant-1.8.x/</link><pubDate>Wed, 06 Mar 2024 00:00:00 -0800</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/qdrant-1.8.x/</guid><description>&lt;h1 id="unlocking-next-level-search-exploring-qdrant-180s-advanced-search-capabilities">Unlocking Next-Level Search: Exploring Qdrant 1.8.0&amp;rsquo;s Advanced Search Capabilities&lt;/h1>
&lt;p>&lt;a href="https://github.com/qdrant/qdrant/releases/tag/v1.8.0" target="_blank" rel="noopener nofollow">Qdrant 1.8.0 is out!&lt;/a>.
This time around, we have focused on Qdrant&amp;rsquo;s internals. Our goal was to optimize performance so that your existing setup can run faster and save on compute. Here is what we&amp;rsquo;ve been up to:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Faster &lt;a href="https://qdrant.tech/articles/sparse-vectors/" target="_blank" rel="noopener nofollow">sparse vectors&lt;/a>:&lt;/strong> &lt;a href="https://qdrant.tech/articles/hybrid-search/" target="_blank" rel="noopener nofollow">Hybrid search&lt;/a> is up to 16x faster now!&lt;/li>
&lt;li>&lt;strong>CPU resource management:&lt;/strong> You can allocate CPU threads for faster indexing.&lt;/li>
&lt;li>&lt;strong>Better indexing performance:&lt;/strong> We optimized text &lt;a href="https://qdrant.tech/documentation/concepts/indexing/" target="_blank" rel="noopener nofollow">indexing&lt;/a> on the backend.&lt;/li>
&lt;/ul>
&lt;h2 id="faster-search-with-sparse-vectors">Faster search with sparse vectors&lt;/h2>
&lt;p>Search throughput is now up to 16 times faster for sparse vectors. If you are &lt;a href="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/sparse-vectors/">using Qdrant for hybrid search&lt;/a>, this means that you can now handle up to sixteen times as many queries. This improvement comes from extensive backend optimizations aimed at increasing efficiency and capacity.&lt;/p></description></item><item><title>Optimizing RAG Through an Evaluation-Based Methodology</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/rapid-rag-optimization-with-qdrant-and-quotient/</link><pubDate>Wed, 12 Jun 2024 00:00:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/rapid-rag-optimization-with-qdrant-and-quotient/</guid><description>&lt;p>In today&amp;rsquo;s fast-paced, information-rich world, AI is revolutionizing knowledge management. The systematic process of capturing, distributing, and effectively using knowledge within an organization is one of the fields in which AI provides exceptional value today.&lt;/p>
&lt;blockquote>
&lt;p>The potential for AI-powered knowledge management increases when leveraging &lt;a href="https://qdrant.tech/rag/rag-evaluation-guide/" target="_blank" rel="noopener nofollow">Retrieval Augmented Generation (RAG)&lt;/a>, a methodology that enables LLMs to access a vast, diverse repository of factual information from knowledge stores, such as vector databases.&lt;/p>
&lt;/blockquote>
&lt;p>This process enhances the accuracy, relevance, and reliability of generated text, thereby mitigating the risk of faulty, incorrect, or nonsensical results sometimes associated with traditional LLMs. This method ensures that the answers are not only contextually relevant but also up-to-date, reflecting the latest insights and data available.&lt;/p></description></item><item><title>Is RAG Dead? The Role of Vector Databases in Vector Search | Qdrant</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/rag-is-dead/</link><pubDate>Tue, 27 Feb 2024 00:00:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/rag-is-dead/</guid><description>&lt;h1 id="is-rag-dead-the-role-of-vector-databases-in-ai-efficiency-and-vector-search">Is RAG Dead? The Role of Vector Databases in AI Efficiency and Vector Search&lt;/h1>
&lt;p>When Anthropic came out with a context window of 100K tokens, they said: “&lt;em>&lt;a href="https://qdrant.tech/solutions/" target="_blank" rel="noopener nofollow">Vector search&lt;/a> is dead. LLMs are getting more accurate and won’t need RAG anymore.&lt;/em>”&lt;/p>
&lt;p>Google’s Gemini 1.5 now offers a context window of 10 million tokens. &lt;a href="https://storage.googleapis.com/deepmind-media/gemini/gemini_v1_5_report.pdf" target="_blank" rel="noopener nofollow">Their supporting paper&lt;/a> claims victory over accuracy issues, even when applying Greg Kamradt’s &lt;a href="https://twitter.com/GregKamradt/status/1722386725635580292" target="_blank" rel="noopener nofollow">NIAH methodology&lt;/a>.&lt;/p>
&lt;p>&lt;em>It’s over. &lt;a href="https://qdrant.tech/articles/what-is-rag-in-ai/" target="_blank" rel="noopener nofollow">RAG&lt;/a> (Retrieval Augmented Generation) must be completely obsolete now. Right?&lt;/em>&lt;/p></description></item><item><title>Optimizing OpenAI Embeddings: Enhance Efficiency with Qdrant's Binary Quantization</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/binary-quantization-openai/</link><pubDate>Wed, 21 Feb 2024 13:12:08 -0800</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/binary-quantization-openai/</guid><description>&lt;p>OpenAI Ada-003 embeddings are a powerful tool for natural language processing (NLP). However, the size of the embeddings is a challenge, especially with real-time search and retrieval. In this article, we explore how you can use Qdrant&amp;rsquo;s Binary Quantization to enhance the performance and efficiency of OpenAI embeddings.&lt;/p>
&lt;p>In this post, we discuss:&lt;/p>
&lt;ul>
&lt;li>The significance of OpenAI embeddings and real-world challenges.&lt;/li>
&lt;li>Qdrant&amp;rsquo;s Binary Quantization, and how it can improve the performance of OpenAI embeddings.&lt;/li>
&lt;li>Results of an experiment that highlights improvements in search efficiency and accuracy.&lt;/li>
&lt;li>Implications of these findings for real-world applications.&lt;/li>
&lt;li>Best practices for leveraging Binary Quantization to enhance OpenAI embeddings.&lt;/li>
&lt;/ul>
&lt;p>If you&amp;rsquo;re new to Binary Quantization, consider reading our article which walks you through the concept and &lt;a href="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/binary-quantization/">how to use it with Qdrant&lt;/a>.&lt;/p></description></item><item><title>How to Implement Multitenancy and Custom Sharding in Qdrant</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/multitenancy/</link><pubDate>Tue, 06 Feb 2024 13:21:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/multitenancy/</guid><description>&lt;h1 id="scaling-your-machine-learning-setup-the-power-of-multitenancy-and-custom-sharding-in-qdrant">Scaling Your Machine Learning Setup: The Power of Multitenancy and Custom Sharding in Qdrant&lt;/h1>
&lt;p>We are seeing the topics of &lt;a href="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/documentation/guides/multiple-partitions/">multitenancy&lt;/a> and &lt;a href="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/documentation/guides/distributed_deployment/#sharding">distributed deployment&lt;/a> pop up daily on our &lt;a href="https://qdrant.to/discord" target="_blank" rel="noopener nofollow">Discord support channel&lt;/a>. This tells us that many of you are looking to scale Qdrant along with the rest of your machine learning setup.&lt;/p>
&lt;p>Whether you are building a bank fraud-detection system, &lt;a href="https://qdrant.tech/articles/what-is-rag-in-ai/" target="_blank" rel="noopener nofollow">RAG&lt;/a> for e-commerce, or services for the federal government, you will need to leverage a multitenant architecture to scale your product.
In the world of SaaS and enterprise apps, this setup is the norm. It will considerably increase your application&amp;rsquo;s performance and lower your hosting costs.&lt;/p></description></item><item><title> Data Privacy with Qdrant: Implementing Role-Based Access Control (RBAC)</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/data-privacy/</link><pubDate>Tue, 18 Jun 2024 08:00:00 -0300</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/data-privacy/</guid><description>&lt;p>Data stored in vector databases is often proprietary to the enterprise and may include sensitive information like customer records, legal contracts, electronic health records (EHR), financial data, and intellectual property. Moreover, strong security measures become critical to safeguarding this data. If the data stored in a vector database is not secured, it may open a vulnerability known as &amp;ldquo;&lt;a href="https://arxiv.org/abs/2004.00053" target="_blank" rel="noopener nofollow">embedding inversion attack&lt;/a>,&amp;rdquo; where malicious actors could potentially &lt;a href="https://arxiv.org/pdf/2305.03010" target="_blank" rel="noopener nofollow">reconstruct the original data from the embeddings&lt;/a> themselves.&lt;/p></description></item><item><title>Discovery needs context</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/discovery-search/</link><pubDate>Wed, 31 Jan 2024 08:00:00 -0300</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/discovery-search/</guid><description>&lt;h1 id="discovery-needs-context">Discovery needs context&lt;/h1>
&lt;p>When Christopher Columbus and his crew sailed to cross the Atlantic Ocean, they were not looking for the Americas. They were looking for a new route to India because they were convinced that the Earth was round. They didn&amp;rsquo;t know anything about a new continent, but since they were going west, they stumbled upon it.&lt;/p>
&lt;p>They couldn&amp;rsquo;t reach their &lt;em>target&lt;/em>, because the geography didn&amp;rsquo;t let them, but once they realized it wasn&amp;rsquo;t India, they claimed it a new &amp;ldquo;discovery&amp;rdquo; for their crown. If we consider that sailors need water to sail, then we can establish a &lt;em>context&lt;/em> which is positive in the water, and negative on land. Once the sailor&amp;rsquo;s search was stopped by the land, they could not go any further, and a new route was found. Let&amp;rsquo;s keep these concepts of &lt;em>target&lt;/em> and &lt;em>context&lt;/em> in mind as we explore the new functionality of Qdrant: &lt;strong>Discovery search&lt;/strong>.&lt;/p></description></item><item><title>What are Vector Embeddings? - Revolutionize Your Search Experience</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/what-are-embeddings/</link><pubDate>Tue, 06 Feb 2024 15:29:33 -0300</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/what-are-embeddings/</guid><description>&lt;blockquote>
&lt;p>&lt;strong>Embeddings&lt;/strong> are numerical machine learning representations of the semantics of the input data. They capture the meaning of complex, high-dimensional data, like text, images, or audio, in vectors, enabling algorithms to process and analyze the data more efficiently.&lt;/p>
&lt;/blockquote>
&lt;p>You know when you’re scrolling through your social media feeds and the content just feels incredibly tailored to you? There&amp;rsquo;s the news you care about, followed by a perfect tutorial with your favorite tech stack, and then a meme that makes you laugh so hard you snort.&lt;/p></description></item><item><title>What is a Sparse Vector? How to Achieve Vector-based Hybrid Search</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/sparse-vectors/</link><pubDate>Sat, 09 Dec 2023 13:00:00 +0300</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/sparse-vectors/</guid><description>&lt;p>Think of a library with a vast index card system. Each index card only has a few keywords marked out (sparse vector) of a large possible set for each book (document). This is what sparse vectors enable for text.&lt;/p>
&lt;h2 id="what-are-sparse-and-dense-vectors">What are sparse and dense vectors?&lt;/h2>
&lt;p>Sparse vectors are like the Marie Kondo of data—keeping only what sparks joy (or relevance, in this case).&lt;/p>
&lt;p>Consider a simplified example of 2 documents, each with 200 words. A dense vector would have several hundred non-zero values, whereas a sparse vector could have much fewer, say only 20, non-zero values.&lt;/p></description></item><item><title>Qdrant 1.7.0 has just landed!</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/qdrant-1.7.x/</link><pubDate>Sun, 10 Dec 2023 10:00:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/qdrant-1.7.x/</guid><description>&lt;p>Please welcome the long-awaited &lt;a href="https://github.com/qdrant/qdrant/releases/tag/v1.7.0" target="_blank" rel="noopener nofollow">Qdrant 1.7.0 release&lt;/a>. Besides a handful of minor fixes and improvements, this release brings some cool brand-new features that we are excited to share!
The latest version of your favorite vector search engine finally supports &lt;strong>sparse vectors&lt;/strong>. That&amp;rsquo;s a feature many of you requested, so how could we ignore it?
We also decided to continue our journey with &lt;a href="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/vector-similarity-beyond-search/">vector similarity beyond search&lt;/a>. The new Discovery API covers some utterly new use cases. We&amp;rsquo;re more than excited to see what you will build with it!
But there is more to it! Check out what&amp;rsquo;s new in &lt;strong>Qdrant 1.7.0&lt;/strong>!&lt;/p></description></item><item><title>Deliver Better Recommendations with Qdrant’s new API</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/new-recommendation-api/</link><pubDate>Wed, 25 Oct 2023 09:46:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/new-recommendation-api/</guid><description>&lt;p>The most popular use case for vector search engines, such as Qdrant, is Semantic search with a single query vector. Given the
query, we can vectorize (embed) it and find the closest points in the index. But &lt;a href="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/vector-similarity-beyond-search/">Vector Similarity beyond Search&lt;/a>
does exist, and recommendation systems are a great example. Recommendations might be seen as a multi-aim search, where we want
to find items close to positive and far from negative examples. This use of vector databases has many applications, including
recommendation systems for e-commerce, content, or even dating apps.&lt;/p></description></item><item><title>Vector Search as a dedicated service</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/dedicated-service/</link><pubDate>Thu, 30 Nov 2023 10:00:00 +0300</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/dedicated-service/</guid><description>&lt;p>Ever since the data science community discovered that vector search significantly improves LLM answers,
various vendors and enthusiasts have been arguing over the proper solutions to store embeddings.&lt;/p>
&lt;p>Some say storing them in a specialized engine (aka vector database) is better. Others say that it&amp;rsquo;s enough to use plugins for existing databases.&lt;/p>
&lt;p>Here are &lt;a href="https://nextword.substack.com/p/vector-database-is-not-a-separate" target="_blank" rel="noopener nofollow">just&lt;/a> a &lt;a href="https://stackoverflow.blog/2023/09/20/do-you-need-a-specialized-vector-database-to-implement-vector-search-well/" target="_blank" rel="noopener nofollow">few&lt;/a> of &lt;a href="https://www.singlestore.com/blog/why-your-vector-database-should-not-be-a-vector-database/" target="_blank" rel="noopener nofollow">them&lt;/a>.&lt;/p>
&lt;p>This article presents our vision and arguments on the topic.
We will:&lt;/p></description></item><item><title>FastEmbed: Qdrant's Efficient Python Library for Embedding Generation</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/fastembed/</link><pubDate>Wed, 18 Oct 2023 10:00:00 +0300</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/fastembed/</guid><description>&lt;p>Data Science and Machine Learning practitioners often find themselves navigating through a labyrinth of models, libraries, and frameworks. Which model to choose, what embedding size, and how to approach tokenizing, are just some questions you are faced with when starting your work. We understood how many data scientists wanted an easier and more intuitive means to do their embedding work. This is why we built FastEmbed, a Python library engineered for speed, efficiency, and usability. We have created easy to use default workflows, handling the 80% use cases in NLP embedding.&lt;/p></description></item><item><title>Google Summer of Code 2023 - Polygon Geo Filter for Qdrant Vector Database</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/geo-polygon-filter-gsoc/</link><pubDate>Thu, 12 Oct 2023 08:00:00 +0300</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/geo-polygon-filter-gsoc/</guid><description>&lt;h2 id="introduction">Introduction&lt;/h2>
&lt;p>Greetings, I&amp;rsquo;m Zein Wen, and I was a Google Summer of Code 2023 participant at Qdrant. I got to work with an amazing mentor, Arnaud Gourlay, on enhancing the Qdrant Geo Polygon Filter. This new feature allows users to refine their query results using polygons. As the latest addition to the Geo Filter family of radius and rectangle filters, this enhancement promises greater flexibility in querying geo data, unlocking interesting new use cases.&lt;/p></description></item><item><title>Binary Quantization - Vector Search, 40x Faster</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/binary-quantization/</link><pubDate>Mon, 18 Sep 2023 13:00:00 +0300</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/binary-quantization/</guid><description>&lt;h1 id="optimizing-high-dimensional-vectors-with-binary-quantization">Optimizing High-Dimensional Vectors with Binary Quantization&lt;/h1>
&lt;p>Qdrant is built to handle typical scaling challenges: high throughput, low latency and efficient indexing. &lt;strong>Binary quantization (BQ)&lt;/strong> is our latest attempt to give our customers the edge they need to scale efficiently. This feature is particularly excellent for collections with large vector lengths and a large number of points.&lt;/p>
&lt;p>Our results are dramatic: Using BQ will reduce your memory consumption and improve retrieval speeds by up to 40x.&lt;/p></description></item><item><title>Food Discovery Demo</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/food-discovery-demo/</link><pubDate>Tue, 05 Sep 2023 11:32:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/food-discovery-demo/</guid><description>&lt;p>Not every search journey begins with a specific destination in mind. Sometimes, you just want to explore and see what’s out there and what you might like.
This is especially true when it comes to food. You might be craving something sweet, but you don’t know what. You might be also looking for a new dish to try,
and you just want to see the options available. In these cases, it&amp;rsquo;s impossible to express your needs in a textual query, as the thing you are looking for is not
yet defined. Qdrant&amp;rsquo;s semantic search for images is useful when you have a hard time expressing your tastes in words.&lt;/p></description></item><item><title>Google Summer of Code 2023 - Web UI for Visualization and Exploration</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/web-ui-gsoc/</link><pubDate>Mon, 28 Aug 2023 08:00:00 +0300</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/web-ui-gsoc/</guid><description>&lt;h2 id="introduction">Introduction&lt;/h2>
&lt;p>Hello everyone! My name is Kartik Gupta, and I am thrilled to share my coding journey as part of the Google Summer of Code 2023 program. This summer, I had the incredible opportunity to work on an exciting project titled &amp;ldquo;Web UI for Visualization and Exploration&amp;rdquo; for Qdrant, a vector search engine. In this article, I will take you through my experience, challenges, and achievements during this enriching coding journey.&lt;/p></description></item><item><title>Qdrant Summer of Code 2024 - WASM based Dimension Reduction</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/dimension-reduction-qsoc/</link><pubDate>Sat, 31 Aug 2024 10:39:48 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/dimension-reduction-qsoc/</guid><description>&lt;h2 id="introduction">Introduction&lt;/h2>
&lt;p>Hello, everyone! I&amp;rsquo;m Jishan Bhattacharya, and I had the incredible opportunity to intern at Qdrant this summer as part of the Qdrant Summer of Code 2024. Under the mentorship of &lt;a href="https://www.linkedin.com/in/andrey-vasnetsov-75268897/" target="_blank" rel="noopener nofollow">Andrey Vasnetsov&lt;/a>, I dived into the world of performance optimization, focusing on enhancing vector visualization using WebAssembly (WASM). In this article, I&amp;rsquo;ll share the insights, challenges, and accomplishments from my journey — one filled with learning, experimentation, and plenty of coding adventures.&lt;/p></description></item><item><title>Semantic Search As You Type</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/search-as-you-type/</link><pubDate>Mon, 14 Aug 2023 00:00:00 +0100</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/search-as-you-type/</guid><description>&lt;p>Qdrant is one of the fastest vector search engines out there, so while looking for a demo to show off, we came upon the idea to do a search-as-you-type box with a fully semantic search backend. Now we already have a semantic/keyword hybrid search on our website. But that one is written in Python, which incurs some overhead for the interpreter. 
Naturally, I wanted to see how fast I could go using Rust.&lt;/p></description></item><item><title>Vector Similarity: Going Beyond Full-Text Search | Qdrant</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/vector-similarity-beyond-search/</link><pubDate>Tue, 08 Aug 2023 08:00:00 +0300</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/vector-similarity-beyond-search/</guid><description>&lt;h1 id="vector-similarity-unleashing-data-insights-beyond-traditional-search">Vector Similarity: Unleashing Data Insights Beyond Traditional Search&lt;/h1>
&lt;p>When making use of unstructured data, there are traditional go-to solutions that are well-known for developers:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Full-text search&lt;/strong> when you need to find documents that contain a particular word or phrase.&lt;/li>
&lt;li>&lt;strong>&lt;a href="https://qdrant.tech/documentation/overview/vector-search/" target="_blank" rel="noopener nofollow">Vector search&lt;/a>&lt;/strong> when you need to find documents that are semantically similar to a given query.&lt;/li>
&lt;/ul>
&lt;p>Sometimes people mix those two approaches, so it might look like vector similarity is just an extension of full-text search. However, in this article, we will explore some promising new techniques that can be used to expand the use cases of unstructured data and demonstrate that vector similarity creates its own stack of data exploration tools.&lt;/p></description></item><item><title>Serverless Semantic Search</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/serverless/</link><pubDate>Wed, 12 Jul 2023 10:00:00 +0100</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/serverless/</guid><description>&lt;p>Do you want to insert a semantic search function into your website or online app? Now you can do so - without spending any money! In this example, you will learn how to create a free prototype search engine for your own non-commercial purposes.&lt;/p>
&lt;h2 id="ingredients">Ingredients&lt;/h2>
&lt;ul>
&lt;li>A &lt;a href="https://rust-lang.org" target="_blank" rel="noopener nofollow">Rust&lt;/a> toolchain&lt;/li>
&lt;li>&lt;a href="https://cargo-lambda.info" target="_blank" rel="noopener nofollow">cargo lambda&lt;/a> (install via package manager, &lt;a href="https://github.com/cargo-lambda/cargo-lambda/releases" target="_blank" rel="noopener nofollow">download&lt;/a> binary or &lt;code>cargo install cargo-lambda&lt;/code>)&lt;/li>
&lt;li>The &lt;a href="https://aws.amazon.com/cli" target="_blank" rel="noopener nofollow">AWS CLI&lt;/a>&lt;/li>
&lt;li>Qdrant instance (&lt;a href="https://cloud.qdrant.io" target="_blank" rel="noopener nofollow">free tier&lt;/a> available)&lt;/li>
&lt;li>An embedding provider service of your choice (see our &lt;a href="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/documentation/embeddings/">Embeddings docs&lt;/a>. You may be able to get credits from &lt;a href="https://aigrant.org" target="_blank" rel="noopener nofollow">AI Grant&lt;/a>, also Cohere has a &lt;a href="https://cohere.com/pricing" target="_blank" rel="noopener nofollow">rate-limited non-commercial free tier&lt;/a>)&lt;/li>
&lt;li>AWS Lambda account (12-month free tier available)&lt;/li>
&lt;/ul>
&lt;h2 id="what-youre-going-to-build">What you&amp;rsquo;re going to build&lt;/h2>
&lt;p>You&amp;rsquo;ll combine the embedding provider and the Qdrant instance into a neat semantic search, calling both services from a small Lambda function.&lt;/p></description></item><item><title>Introducing Qdrant 1.3.0</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/qdrant-1.3.x/</link><pubDate>Mon, 26 Jun 2023 00:00:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/qdrant-1.3.x/</guid><description>&lt;p>A brand-new &lt;a href="https://github.com/qdrant/qdrant/releases/tag/v1.3.0" target="_blank" rel="noopener nofollow">Qdrant 1.3.0 release&lt;/a> comes packed with a plethora of new features, performance improvements and bug fixes:&lt;/p>
&lt;ol>
&lt;li>Asynchronous I/O interface: Reduce overhead by managing I/O operations asynchronously, thus minimizing context switches.&lt;/li>
&lt;li>Oversampling for Quantization: Improve the accuracy and performance of your queries while using Scalar or Product Quantization.&lt;/li>
&lt;li>Grouping API lookup: Storage optimization method that lets you look for points in another collection using group ids.&lt;/li>
&lt;li>Qdrant Web UI: A convenient dashboard to help you manage data stored in Qdrant.&lt;/li>
&lt;li>Temp directory for Snapshots: Set a separate storage directory for temporary snapshots on a faster disk.&lt;/li>
&lt;li>Other important changes&lt;/li>
&lt;/ol>
&lt;p>Your feedback is valuable to us, and we are always trying to include some of your feature requests in our roadmap. Join &lt;a href="https://qdrant.to/discord" target="_blank" rel="noopener nofollow">our Discord community&lt;/a> and help us build Qdrant!&lt;/p></description></item><item><title>Qdrant under the hood: io_uring</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/io_uring/</link><pubDate>Wed, 21 Jun 2023 09:45:00 +0200</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/io_uring/</guid><description>&lt;p>With Qdrant &lt;a href="https://github.com/qdrant/qdrant/releases/tag/v1.3.0" target="_blank" rel="noopener nofollow">version 1.3.0&lt;/a> we
introduce the alternative io_uring based &lt;em>async uring&lt;/em> storage backend on
Linux-based systems. Since its introduction, io_uring has been known to improve
async throughput wherever the OS syscall overhead gets too high, which tends to
occur in situations where software becomes &lt;em>IO bound&lt;/em> (that is, mostly waiting
on disk).&lt;/p>
&lt;h2 id="inputoutput">Input+Output&lt;/h2>
&lt;p>Around the mid-90s, the internet took off. The first servers used a process-
per-request setup, which was good for serving hundreds if not thousands of
concurrent requests. The POSIX Input + Output (IO) was modeled in a strictly
synchronous way. The overhead of starting a new process for each request made
this model unsustainable. So servers started forgoing process separation, opting
for the thread-per-request model. But even that ran into limitations.&lt;/p></description></item><item><title>Product Quantization in Vector Search | Qdrant</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/product-quantization/</link><pubDate>Tue, 30 May 2023 09:45:00 +0200</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/product-quantization/</guid><description>&lt;h1 id="product-quantization-demystified-streamlining-efficiency-in-data-management">Product Quantization Demystified: Streamlining Efficiency in Data Management&lt;/h1>
&lt;p>Qdrant 1.1.0 brought the support of &lt;a href="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/scalar-quantization/">Scalar Quantization&lt;/a>,
a technique that reduces the memory footprint by up to four times by using &lt;code>int8&lt;/code> to represent
the values that would normally be represented by &lt;code>float32&lt;/code>.&lt;/p>
&lt;p>The memory usage in &lt;a href="https://qdrant.tech/solutions/" target="_blank" rel="noopener nofollow">vector search&lt;/a> might be reduced even further! Please welcome &lt;strong>Product
Quantization&lt;/strong>, a brand-new feature of Qdrant 1.2.0!&lt;/p>
&lt;h2 id="what-is-product-quantization">What is Product Quantization?&lt;/h2>
&lt;p>Product Quantization converts floating-point numbers into integers like every other quantization
method. However, the process is slightly more complicated than &lt;a href="https://qdrant.tech/articles/scalar-quantization/" target="_blank" rel="noopener nofollow">Scalar Quantization&lt;/a> and is more customizable, so you can find the sweet spot between memory usage and search precision. This article
covers all the steps required to perform Product Quantization and the way it&amp;rsquo;s implemented in Qdrant.&lt;/p></description></item><item><title>Scalar Quantization: Background, Practices &amp; More | Qdrant</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/scalar-quantization/</link><pubDate>Mon, 27 Mar 2023 10:45:00 +0100</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/scalar-quantization/</guid><description>&lt;h1 id="efficiency-unleashed-the-power-of-scalar-quantization">Efficiency Unleashed: The Power of Scalar Quantization&lt;/h1>
&lt;p>High-dimensional vector embeddings can be memory-intensive, especially when working with
large datasets consisting of millions of vectors. Memory footprint really starts being
a concern when we scale things up. A simple choice of the data type used to store a single
number impacts even billions of numbers and can drive the memory requirements crazy. The
higher the precision of your type, the more accurately you can represent the numbers.
The more accurate your vectors, the more precise the distance calculation. But the
advantages stop paying off when you need to order more and more memory.&lt;/p></description></item><item><title>On Unstructured Data, Vector Databases, New AI Age, and Our Seed Round.</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/seed-round/</link><pubDate>Wed, 19 Apr 2023 00:42:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/seed-round/</guid><description>&lt;blockquote>
&lt;p>Vector databases are here to stay. The New Age of AI is powered by vector embeddings, and vector databases are a foundational part of the stack. At Qdrant, we are working on cutting-edge open-source vector similarity search solutions to power fantastic AI applications with the best possible performance and excellent developer experience.&lt;/p>
&lt;p>Our 7.5M seed funding – led by &lt;a href="https://www.unusual.vc/" target="_blank" rel="noopener nofollow">Unusual Ventures&lt;/a>, awesome angels, and existing investors – will help us bring these innovations to engineers and empower them to make the most of their unstructured data and the awesome power of LLMs at any scale.&lt;/p></description></item><item><title>Using LangChain for Question Answering with Qdrant</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/langchain-integration/</link><pubDate>Tue, 31 Jan 2023 10:53:20 +0100</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/langchain-integration/</guid><description>&lt;h1 id="streamlining-question-answering-simplifying-integration-with-langchain-and-qdrant">Streamlining Question Answering: Simplifying Integration with LangChain and Qdrant&lt;/h1>
&lt;p>Building applications with Large Language Models doesn&amp;rsquo;t have to be complicated. A lot has been going on recently to simplify the development,
so you can utilize already pre-trained models and support even complex pipelines with a few lines of code. &lt;a href="https://langchain.readthedocs.io" target="_blank" rel="noopener nofollow">LangChain&lt;/a>
provides unified interfaces to different libraries, so you can avoid writing boilerplate code and focus on the value you want to bring.&lt;/p></description></item><item><title>Minimal RAM you need to serve a million vectors</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/memory-consumption/</link><pubDate>Wed, 07 Dec 2022 10:18:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/memory-consumption/</guid><description>&lt;!--
1. How people usually measure memory and why it might be misleading
2. How to properly measure memory
3. Try different configurations of Qdrant and see how they affect the memory consumption and search speed
4. Conclusion
-->
&lt;!--
Introduction:

1. We are used to measuring memory consumption by looking at `htop`. But it can be misleading.
2. There are multiple reasons why it is wrong:
 1. Process may allocate memory, but not use it.
 2. Process may not free deallocated memory.
 3. Process might be forked and memory is shared between processes.
 4. Process may use disk cache.
3. As a result, if you see `10GB` memory consumption in `htop`, it doesn't mean that your process actually needs `10GB` of RAM to work.
-->
&lt;p>When it comes to measuring the memory consumption of our processes, we often rely on tools such as &lt;code>htop&lt;/code> to give us an indication of how much RAM is being used. However, this method can be misleading and doesn&amp;rsquo;t always accurately reflect the true memory usage of a process.&lt;/p></description></item><item><title>Question Answering as a Service with Cohere and Qdrant</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/qa-with-cohere-and-qdrant/</link><pubDate>Tue, 29 Nov 2022 15:45:00 +0100</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/qa-with-cohere-and-qdrant/</guid><description>&lt;p>Bi-encoders are probably the most efficient way of setting up a semantic Question Answering system.
This architecture relies on the same neural model that creates vector embeddings for both questions and answers.
The assumption is, both question and answer should have representations close to each other in the latent space.
It should be like that because they should both describe the same semantic concept. That doesn&amp;rsquo;t apply
to answers like &amp;ldquo;Yes&amp;rdquo; or &amp;ldquo;No&amp;rdquo;, but standard FAQ-like problems are a bit easier, as there is typically
an overlap between both texts. Not necessarily in terms of wording, but in their semantics.&lt;/p></description></item><item><title>Introducing Qdrant 1.2.x</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/qdrant-1.2.x/</link><pubDate>Wed, 24 May 2023 10:45:00 +0200</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/qdrant-1.2.x/</guid><description>&lt;p>A brand-new Qdrant 1.2 release comes packed with a plethora of new features, some of which
were highly requested by our users. If you want to shape the development of the Qdrant vector
database, please &lt;a href="https://qdrant.to/discord" target="_blank" rel="noopener nofollow">join our Discord community&lt;/a> and let us know
how you use it!&lt;/p>
&lt;h2 id="new-features">New features&lt;/h2>
&lt;p>As usual, a minor version update of Qdrant brings some interesting new features. We love to see your
feedback, and we tried to include the features most requested by our community.&lt;/p></description></item><item><title>Finding errors in datasets with Similarity Search</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/dataset-quality/</link><pubDate>Mon, 18 Jul 2022 10:18:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/dataset-quality/</guid><description>&lt;p>Nowadays, people create a huge number of applications of various types and solve problems in different areas.
Despite such diversity, they have something in common - they need to process data.
Real-world data is a living structure: it grows day by day, changes a lot, and becomes harder to work with.&lt;/p>
&lt;p>In some cases, you need to categorize or label your data, which can be a tough problem given its scale.
The process of splitting or labelling is error-prone and these errors can be very costly.
Imagine that you failed to achieve the desired quality of the model due to inaccurate labels.
Worse, your users are faced with a lot of irrelevant items, unable to find what they need and getting annoyed by it.
Thus, you get poor retention, and it directly impacts company revenue.
It is really important to avoid such errors in your data.&lt;/p></description></item><item><title>Q&amp;A with Similarity Learning</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/faq-question-answering/</link><pubDate>Tue, 28 Jun 2022 08:57:07 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/faq-question-answering/</guid><description>&lt;h1 id="question-answering-system-with-similarity-learning-and-quaterion">Question-answering system with Similarity Learning and Quaterion&lt;/h1>
&lt;p>Many problems in modern machine learning are approached as classification tasks.
Some are classification tasks by design, but others are artificially transformed into such.
And when you apply an approach that does not naturally fit your problem, you risk ending up with an over-complicated or bulky solution.
In some cases, you would even get worse performance.&lt;/p>
&lt;p>Imagine that you got a new task and decided to solve it with a good old classification approach.
Firstly, you will need labeled data.
If it came on a plate with the task, you&amp;rsquo;re lucky, but if it didn&amp;rsquo;t, you might need to label it manually.
And I guess you are already familiar with how painful it might be.&lt;/p></description></item><item><title>Why Rust?</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/why-rust/</link><pubDate>Thu, 11 May 2023 10:00:00 +0100</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/why-rust/</guid><description>&lt;h1 id="building-qdrant-in-rust">Building Qdrant in Rust&lt;/h1>
&lt;p>Looking at the &lt;a href="https://github.com/qdrant/qdrant" target="_blank" rel="noopener nofollow">GitHub repository&lt;/a>, you can see that Qdrant is built in &lt;a href="https://rust-lang.org" target="_blank" rel="noopener nofollow">Rust&lt;/a>. Other offerings may be written in C++, Go, Java or even Python. So why did Qdrant choose Rust? Our founder Andrey had built the first prototype in C++, but didn’t trust his command of the language to scale to a production system (to be frank, he likened it to cutting his leg off). He was well versed in Java and Scala and also knew some Python. However, he considered neither a good fit:&lt;/p>
by Allen AI has attracted attention in the NLP community: the authors cache the output of an intermediate layer
in the training and inference phases to achieve a speedup of ~83%
with a negligible loss in model performance.
This technique is quite similar to &lt;a href="https://quaterion.qdrant.tech/tutorials/cache_tutorial.html" target="_blank" rel="noopener nofollow">the caching mechanism in Quaterion&lt;/a>,
but the latter supports any data modality, while the former focuses only on language models,
although it presents important insights from its experiments.
In this post, I will share our findings combined with theirs,
hoping to provide the community with a wider perspective on layer recycling.&lt;/p></description></item><item><title>Fine Tuning Similar Cars Search</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/cars-recognition/</link><pubDate>Tue, 28 Jun 2022 13:00:00 +0300</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/cars-recognition/</guid><description>&lt;p>Supervised classification is one of the most widely used training objectives in machine learning,
but not every task can be defined as such. For example,&lt;/p>
&lt;ol>
&lt;li>Your classes may change quickly, e.g., new classes may be added over time,&lt;/li>
&lt;li>You may not have samples from every possible category,&lt;/li>
&lt;li>It may be impossible to enumerate all the possible classes during the training time,&lt;/li>
&lt;li>You may have an essentially different task, e.g., search or retrieval.&lt;/li>
&lt;/ol>
&lt;p>All such problems may be efficiently solved with similarity learning.&lt;/p></description></item><item><title>Metric Learning Tips &amp; Tricks</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/metric-learning-tips/</link><pubDate>Sat, 15 May 2021 10:18:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/metric-learning-tips/</guid><description>&lt;h2 id="how-to-train-object-matching-model-with-no-labeled-data-and-use-it-in-production">How to train object matching model with no labeled data and use it in production&lt;/h2>
&lt;p>Currently, most machine-learning-related business cases are solved as classification problems.
Classification algorithms are so well studied in practice that even if the original problem is not directly a classification task, it is usually decomposed or approximately converted into one.&lt;/p>
&lt;p>However, despite its simplicity, the classification task has requirements that could complicate its production integration and scaling.
E.g., it requires a fixed number of classes, where each class should have a sufficient number of training samples.&lt;/p></description></item><item><title>Metric Learning for Anomaly Detection</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/detecting-coffee-anomalies/</link><pubDate>Wed, 04 May 2022 13:00:00 +0300</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/detecting-coffee-anomalies/</guid><description>&lt;p>Anomaly detection is an appealing yet challenging task that has numerous use cases across various industries.
The complexity results mainly from the fact that the task is data-scarce by definition.&lt;/p>
&lt;p>Moreover, anomalies are, again by definition, subject to frequent change, and they may take unexpected forms.
For that reason, supervised classification-based approaches are:&lt;/p>
&lt;ul>
&lt;li>Data-hungry - requiring a large amount of labeled data;&lt;/li>
&lt;li>Expensive - data labeling is an expensive task itself;&lt;/li>
&lt;li>Time-consuming - you would be trying to collect examples of what is, by definition, scarce;&lt;/li>
&lt;li>Hard to maintain - you would need to re-train the model repeatedly in response to changes in the data distribution.&lt;/li>
&lt;/ul>
&lt;p>These are not desirable features if you want to put your model into production in a rapidly-changing environment.
And, despite all the mentioned difficulties, such approaches do not necessarily offer superior performance compared to the alternatives.
In this post, we will detail the lessons learned from such a use case.&lt;/p></description></item><item><title>Triplet Loss - Advanced Intro</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/triplet-loss/</link><pubDate>Thu, 24 Mar 2022 15:12:00 +0300</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/triplet-loss/</guid><description>&lt;h2 id="what-is-triplet-loss">What is Triplet Loss?&lt;/h2>
&lt;p>Triplet Loss was first introduced in &lt;a href="https://arxiv.org/abs/1503.03832" target="_blank" rel="noopener nofollow">FaceNet: A Unified Embedding for Face Recognition and Clustering&lt;/a> in 2015,
and it has been one of the most popular loss functions for supervised similarity or metric learning ever since.
In its simplest explanation, Triplet Loss encourages that dissimilar pairs be distant from any similar pairs by at least a certain margin value.
Mathematically, the loss value can be calculated as
$L=max(d(a,p) - d(a,n) + m, 0)$, where:&lt;/p></description></item><item><title>Neural Search 101: A Complete Guide and Step-by-Step Tutorial</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/neural-search-tutorial/</link><pubDate>Thu, 10 Jun 2021 10:18:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/neural-search-tutorial/</guid><description>&lt;h1 id="neural-search-101-a-comprehensive-guide-and-step-by-step-tutorial">Neural Search 101: A Comprehensive Guide and Step-by-Step Tutorial&lt;/h1>
&lt;p>Information retrieval technology is one of the main technologies that enabled the modern Internet to exist.
These days, search technology is at the heart of a variety of applications,
from web page search to product recommendations.
For many years, this technology changed very little, until neural networks came into play.&lt;/p>
&lt;p>In this guide we are going to find answers to these questions:&lt;/p></description></item><item><title>Filterable HNSW</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/filterable-hnsw/</link><pubDate>Sun, 24 Nov 2019 22:44:08 +0300</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/filterable-hnsw/</guid><description>&lt;p>If you need to find some similar objects in vector space, provided e.g. by embeddings or matching NN, you can choose among a variety of libraries: Annoy, FAISS or NMSLib.
All of them will give you a fast approximate neighbors search within almost any space.&lt;/p>
&lt;p>But what if you need to introduce some constraints in your search?
For example, you want to search only for products in some category, or to select the most similar customer of a particular brand.
I did not find any simple solutions for this.
There are several discussions like &lt;a href="https://github.com/spotify/annoy/issues/263" target="_blank" rel="noopener nofollow">this&lt;/a>, but they only suggest iterating over the top search results and applying conditions afterwards.&lt;/p></description></item><item><title>Introducing Qdrant 0.11</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/qdrant-0-11-release/</link><pubDate>Wed, 26 Oct 2022 13:55:00 +0200</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/qdrant-0-11-release/</guid><description>&lt;p>We are excited to &lt;a href="https://github.com/qdrant/qdrant/releases/tag/v0.11.0" target="_blank" rel="noopener nofollow">announce the release of Qdrant v0.11&lt;/a>,
which introduces a number of new features and improvements.&lt;/p>
&lt;h2 id="replication">Replication&lt;/h2>
&lt;p>One of the key features in this release is replication support, which allows Qdrant to provide a high availability
setup with distributed deployment out of the box. This, combined with sharding, enables you to horizontally scale
both the size of your collections and the throughput of your cluster. This means that you can use Qdrant to handle
large amounts of data without sacrificing performance or reliability.&lt;/p></description></item><item><title>Qdrant 0.10 released</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/qdrant-0-10-release/</link><pubDate>Mon, 19 Sep 2022 13:30:00 +0200</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/qdrant-0-10-release/</guid><description>&lt;p>&lt;a href="https://github.com/qdrant/qdrant/releases/tag/v0.10.0" target="_blank" rel="noopener nofollow">Qdrant 0.10 is a new version&lt;/a> that brings a lot of performance
improvements, but also some new features which were heavily requested by our users. Here is an overview of what has changed.&lt;/p>
&lt;h2 id="storing-multiple-vectors-per-object">Storing multiple vectors per object&lt;/h2>
&lt;p>Previously, if you wanted to use semantic search with multiple vectors per object, you had to create separate collections
for each vector type, even if the vectors shared other attributes in the payload. With Qdrant 0.10, you can
now store all of these vectors together in the same collection and share a single copy of the payload.
This makes it easier to use semantic search with multiple vector types, and reduces the amount of work you need to do to
set up your collections.&lt;/p></description></item><item><title>Vector Search in constant time</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/quantum-quantization/</link><pubDate>Sat, 01 Apr 2023 00:48:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/quantum-quantization/</guid><description>&lt;p>The advent of quantum computing has revolutionized many areas of science and technology, and one of the most intriguing developments has been its potential application to artificial neural networks (ANNs). One area where quantum computing can significantly improve performance is in vector search, a critical component of many machine learning tasks. In this article, we will discuss the concept of quantum quantization for ANN vector search, focusing on the conversion of float32 to qbit vectors and the ability to perform vector search on arbitrary-sized databases in constant time.&lt;/p></description></item><item><title>Building Performant, Scaled Agentic Vector Search with Qdrant</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/agentic-builders-guide/</link><pubDate>Sun, 26 Oct 2025 00:00:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/agentic-builders-guide/</guid><description>&lt;h2 id="overview">Overview&lt;/h2>
&lt;p>AI agents have grown from simple Q&amp;amp;A chatbots into systems that can independently plan, retrieve, act, and verify tasks. As developers work to recreate real-life workflows with agents, a common starting point is to give your agent access to a search API.&lt;/p>
&lt;p>&lt;img src="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles_data/agentic-builders-guide/agentic-architecture.png" alt="Agentic vector search architecture">&lt;/p>
&lt;h2 id="the-limitations-of-agents">The Limitations of Agents&lt;/h2>
&lt;p>While agents have proven they can create incredible impact, they still face serious limitations without the right tools. This is where a simple search box isn’t enough, and agents often fail when they move from prototype to production in three key areas:&lt;/p></description></item><item><title>MUVERA: Making Multivectors More Performant</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/muvera-embeddings/</link><pubDate>Fri, 05 Sep 2025 00:00:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/muvera-embeddings/</guid><description>&lt;h2 id="what-are-muvera-embeddings">What are MUVERA Embeddings?&lt;/h2>
&lt;p>Multi-vector representations are superior to single-vector embeddings in many benchmarks. It might be tempting to use
them right away, but there is a catch: they are slower to search. Traditional vector search structures like
&lt;a href="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/documentation/concepts/indexing/#vector-index">HNSW&lt;/a> are optimized for retrieving the nearest neighbors of a single
query vector using simple metrics such as cosine similarity. These indexes are not suitable for multi-vector retrieval
strategies, such as MaxSim, where a query and document are each represented by multiple vectors and the final score is
computed as the maximum similarity over all cross-pairings. MaxSim is inherently asymmetric and non-metric, so HNSW
could potentially help us find the closest document token to a given query token, but that does not mean the whole
document is the best hit for the query.&lt;/p></description></item><item><title>How to choose an embedding model</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/how-to-choose-an-embedding-model/</link><pubDate>Tue, 15 Jul 2025 00:00:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/how-to-choose-an-embedding-model/</guid><description>&lt;p>No matter if you are just beginning your journey in the world of vector search, or you are a seasoned practitioner, you
have probably wondered how to choose the right embedding model to achieve the best search quality. There are some
public benchmarks, such as &lt;a href="https://huggingface.co/spaces/mteb/leaderboard" target="_blank" rel="noopener nofollow">MTEB&lt;/a>, that can help you narrow down the
options, but datasets used in those benchmarks will rarely be representative of your domain-specific data. Moreover,
search quality is not the only requirement you could have. For example, some of the best models might be amazingly
accurate for retrieval, but you can&amp;rsquo;t afford to run them, e.g., due to high resource usage or your budget constraints.&lt;/p></description></item><item><title>Vector Search in Production</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/vector-search-production/</link><pubDate>Wed, 30 Apr 2025 00:00:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/vector-search-production/</guid><description>&lt;h2 id="what-does-it-take-to-run-search-in-production">What Does it Take to Run Search in Production?&lt;/h2>
&lt;p>A mid-sized e-commerce company launched a vector search pilot to improve product discovery. During testing, everything ran smoothly. But in production, their queries began failing intermittently: memory errors, disk I/O spikes, and search delays sprang up unexpectedly.&lt;/p>
&lt;p>It turned out the team hadn&amp;rsquo;t adjusted the default configuration settings or reserved dedicated paths for write-ahead logs. Their vector index was too large to fit comfortably in RAM, and it frequently spilled to disk, causing slowdowns.&lt;/p></description></item><item><title>Semantic Cache: Accelerating AI with Lightning-Fast Data Retrieval</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/semantic-cache-ai-data-retrieval/</link><pubDate>Tue, 07 May 2024 00:00:00 -0800</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/semantic-cache-ai-data-retrieval/</guid><description>&lt;h2 id="what-is-semantic-cache">What is Semantic Cache?&lt;/h2>
&lt;p>&lt;strong>Semantic cache&lt;/strong> is a method of retrieval optimization, where similar queries instantly retrieve the same appropriate response from a knowledge base.&lt;/p>
&lt;p>Semantic cache differs from traditional caching methods. In computing, &lt;strong>cache&lt;/strong> refers to high-speed memory that efficiently stores frequently accessed data. In the context of &lt;a href="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/what-is-a-vector-database/">vector databases&lt;/a>, a &lt;strong>semantic cache&lt;/strong> improves AI application performance by storing previously retrieved results along with the conditions under which they were computed. This allows the application to reuse those results when the same or similar conditions occur again, rather than finding them from scratch.&lt;/p></description></item><item><title>Full-text filter and index are already available!</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/qdrant-introduces-full-text-filters-and-indexes/</link><pubDate>Wed, 16 Nov 2022 00:00:00 -0800</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/qdrant-introduces-full-text-filters-and-indexes/</guid><description>&lt;p>Qdrant is designed as an efficient vector database, allowing for a quick search of the nearest neighbours. But, you may find yourself in need of applying some extra filtering on top of the semantic search. Up to version 0.10, Qdrant was offering support for keywords only. Since 0.10, there is a possibility to apply full-text constraints as well. 
There is a new type of filter that you can use to do that, and it can be combined with every other filter type.&lt;/p></description></item><item><title>Optimizing Semantic Search by Managing Multiple Vectors</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/storing-multiple-vectors-per-object-in-qdrant/</link><pubDate>Wed, 05 Oct 2022 00:00:00 -0800</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/storing-multiple-vectors-per-object-in-qdrant/</guid><description>&lt;h1 id="how-to-optimize-vector-storage-by-storing-multiple-vectors-per-object">How to Optimize Vector Storage by Storing Multiple Vectors Per Object&lt;/h1>
&lt;p>In a real case scenario, a single object might be described in several different ways. If you run an e-commerce business, then your items will typically have a name, longer textual description and also a bunch of photos. While cooking, you may care about the list of ingredients, and description of the taste but also the recipe and the way your meal is going to look. Up till now, if you wanted to enable &lt;a href="https://qdrant.tech/documentation/tutorials/search-beginners/" target="_blank" rel="noopener nofollow">semantic search&lt;/a> with multiple vectors per object, Qdrant would require you to create separate collections for each vector type, even though they could share some other attributes in a payload. However, since Qdrant 0.10 you are able to store all those vectors together in the same collection and share a single copy of the payload!&lt;/p></description></item><item><title>Mastering Batch Search for Vector Optimization</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/batch-vector-search-with-qdrant/</link><pubDate>Mon, 26 Sep 2022 00:00:00 -0800</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/batch-vector-search-with-qdrant/</guid><description>&lt;h1 id="how-to-optimize-vector-search-using-batch-search-in-qdrant-0100">How to Optimize Vector Search Using Batch Search in Qdrant 0.10.0&lt;/h1>
&lt;p>The latest release of Qdrant 0.10.0 has introduced a lot of new functionality that simplifies some common tasks. Those new possibilities come with some slightly modified interfaces of the client library. One of the recently introduced features is the ability to query the collection with &lt;a href="https://qdrant.tech/blog/storing-multiple-vectors-per-object-in-qdrant/" target="_blank" rel="noopener nofollow">multiple vectors&lt;/a> at once — a batch search mechanism.&lt;/p>
&lt;p>There are a lot of scenarios in which you may need to perform multiple unrelated tasks at the same time. Previously, you could only send several separate requests to the Qdrant API on your own. But multiple parallel requests may cause significant network overhead and slow down the process, especially over a poor connection.&lt;/p></description></item></channel></rss>