<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Vector Database Benchmarks on Qdrant - Vector Database</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/benchmarks/</link><description>Recent content in Vector Database Benchmarks on Qdrant - Vector Database</description><generator>Hugo</generator><language>en-us</language><managingEditor>info@qdrant.tech (Andrey Vasnetsov)</managingEditor><webMaster>info@qdrant.tech (Andrey Vasnetsov)</webMaster><lastBuildDate>Mon, 13 Feb 2023 00:00:00 +0000</lastBuildDate><atom:link href="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/benchmarks/index.xml" rel="self" type="application/rss+xml"/><item><title>How should vector search be benchmarked?</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/benchmarks/benchmarks-intro/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/benchmarks/benchmarks-intro/</guid><description>&lt;h1 id="benchmarking-vector-databases">Benchmarking Vector Databases&lt;/h1>
&lt;p>At Qdrant, performance is the top priority. We always make sure that we use system resources efficiently so you get the &lt;strong>fastest and most accurate results at the cheapest cloud costs&lt;/strong>. All of our decisions, from &lt;a href="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/why-rust/">choosing Rust&lt;/a>, &lt;a href="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/io_uring/">io optimisations&lt;/a>, &lt;a href="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/serverless/">serverless support&lt;/a>, and &lt;a href="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/binary-quantization/">binary quantization&lt;/a>, to our &lt;a href="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/articles/fastembed/">fastembed library&lt;/a>, are based on this principle. In this article, we compare how Qdrant performs against other vector search engines.&lt;/p>
&lt;p>Here are the principles we followed while designing these benchmarks:&lt;/p></description></item><item><title>Single node benchmarks</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/benchmarks/single-node-speed-benchmark/</link><pubDate>Tue, 23 Aug 2022 00:00:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/benchmarks/single-node-speed-benchmark/</guid><description>&lt;h2 id="observations">Observations&lt;/h2>
&lt;p>Most of the engines have improved since &lt;a href="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/benchmarks/single-node-speed-benchmark-2022/">our last run&lt;/a>. Both life and software have trade-offs, but some engines clearly do better:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>&lt;code>Qdrant&lt;/code> achieves the highest RPS and lowest latencies in almost all scenarios, no matter the precision threshold and the metric we choose.&lt;/strong> It has also shown a 4x RPS gain on one of the datasets.&lt;/li>
&lt;li>&lt;code>Elasticsearch&lt;/code> has become considerably faster in many cases, but it&amp;rsquo;s still very slow in terms of indexing time. It can be 10x slower when storing 10M+ vectors of 96 dimensions! (32 minutes vs. 5.5 hours)&lt;/li>
&lt;li>&lt;code>Milvus&lt;/code> is the fastest when it comes to indexing time and maintains good precision. However, it&amp;rsquo;s not on par with the others in RPS or latency when you have higher-dimensional embeddings or a larger number of vectors.&lt;/li>
&lt;li>&lt;code>Redis&lt;/code> achieves good RPS, but mostly at lower precision. It also achieved low latency with a single thread; however, its latency climbs quickly with more parallel requests. Part of this speed gain comes from its custom protocol.&lt;/li>
&lt;li>&lt;code>Weaviate&lt;/code> has improved the least since our last run.&lt;/li>
&lt;/ul>
&lt;h2 id="how-to-read-the-results">How to read the results&lt;/h2>
&lt;ul>
&lt;li>Choose the dataset and the metric you want to check.&lt;/li>
&lt;li>Select a precision threshold that would be satisfactory for your use case. This is important because ANN search is all about trading precision for speed, which means that in any vector search benchmark, &lt;strong>two results must be compared only at a similar precision&lt;/strong>. However, most benchmarks miss this critical aspect.&lt;/li>
&lt;li>The table is sorted by the value of the selected metric (RPS / Latency / p95 latency / Index time), and the first entry is always the winner of the category 🏆&lt;/li>
&lt;/ul>
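The precision-aware comparison described above can be sketched in Python. The engine names and numbers here are purely illustrative, not real benchmark results:

```python
# Hypothetical benchmark rows: (engine, precision, rps). Real result
# tables come from the open-sourced benchmark suite; these numbers
# are made up for illustration only.
rows = [
    ("engine-a", 0.99, 450.0),
    ("engine-b", 0.95, 1200.0),
    ("engine-c", 0.99, 980.0),
]

def best_at_precision(rows, threshold):
    """Keep only runs meeting the precision threshold, then sort by RPS
    (descending) so the first entry is the winner of the category."""
    eligible = [r for r in rows if r[1] >= threshold]
    return sorted(eligible, key=lambda r: r[2], reverse=True)

# At a 0.99 threshold, engine-b is excluded despite its high RPS.
print(best_at_precision(rows, 0.99))
```

Note how the fastest engine overall (engine-b) does not win at the 0.99 threshold: comparing RPS across different precision levels would be misleading.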
&lt;h3 id="latency-vs-rps">Latency vs RPS&lt;/h3>
&lt;p>In our benchmark we test two main search usage scenarios that arise in practice.&lt;/p></description></item><item><title>Single node benchmarks (2022)</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/benchmarks/single-node-speed-benchmark-2022/</link><pubDate>Tue, 23 Aug 2022 00:00:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/benchmarks/single-node-speed-benchmark-2022/</guid><description>&lt;p>This is an archived version of Single node benchmarks. Please refer to the new version &lt;a href="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/benchmarks/single-node-speed-benchmark/">here&lt;/a>.&lt;/p></description></item><item><title>Filtered search benchmark</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/benchmarks/filtered-search-intro/</link><pubDate>Mon, 13 Feb 2023 00:00:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/benchmarks/filtered-search-intro/</guid><description>&lt;h1 id="filtered-search-benchmark">Filtered search benchmark&lt;/h1>
&lt;p>Applying filters to search results brings a whole new level of complexity.
It is no longer enough to apply one algorithm to plain data. With filtering, it becomes a matter of the &lt;em>cross-integration&lt;/em> of the different indices.&lt;/p>
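The cross-integration problem can be illustrated with a brute-force sketch of the two naive strategies, pre-filtering and post-filtering. This is a simplified illustration on synthetic data, not how any particular engine implements it:

```python
import numpy as np

# Synthetic data: 1000 vectors of dim 16, each with one keyword-like
# payload field taking values 0..9.
rng = np.random.default_rng(0)
vectors = rng.normal(size=(1000, 16)).astype(np.float32)
payload = rng.integers(0, 10, size=1000)
query = rng.normal(size=16).astype(np.float32)

def pre_filter_search(query, allowed_value, k=5):
    """Restrict candidates by payload first, then score only those.
    With a very restrictive filter this can skip the vector index
    entirely and brute-force the remaining few candidates."""
    idx = np.flatnonzero(payload == allowed_value)
    scores = vectors[idx] @ query
    order = np.argsort(-scores)[:k]
    return idx[order]

def post_filter_search(query, allowed_value, k=5, overfetch=50):
    """Take top candidates from an unfiltered search, then drop the
    ones failing the filter. May return fewer than k results if the
    filter is restrictive."""
    scores = vectors @ query
    candidates = np.argsort(-scores)[:overfetch]
    kept = [i for i in candidates if payload[i] == allowed_value]
    return np.array(kept[:k])
```

Neither naive strategy is satisfactory on its own, which is why integrating the payload index with the vector index matters in this benchmark.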
&lt;p>To measure how well different search engines perform in this scenario, we have prepared a set of &lt;strong>Filtered ANN Benchmark Datasets&lt;/strong> -
&lt;a href="https://github.com/qdrant/ann-filtering-benchmark-datasets" target="_blank" rel="noopener nofollow">https://github.com/qdrant/ann-filtering-benchmark-datasets&lt;/a>&lt;/p>
&lt;p>It is similar to the ones used in the &lt;a href="https://github.com/erikbern/ann-benchmarks/" target="_blank" rel="noopener nofollow">ann-benchmarks project&lt;/a> but enriched with payload metadata and pre-generated filtering requests. It includes synthetic and real-world datasets with various filters, from keywords to geo-spatial queries.&lt;/p></description></item><item><title/><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/benchmarks/filtered-search-benchmark/</link><pubDate>Mon, 13 Feb 2023 00:00:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/benchmarks/filtered-search-benchmark/</guid><description>&lt;h2 id="filtered-results">Filtered Results&lt;/h2>
&lt;p>As you can see from the charts, there are three main patterns:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Speed boost&lt;/strong> - for some engines and queries, the filtered search is faster than the unfiltered one. This can happen when the filter is restrictive enough to bypass the vector index entirely.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Speed downturn&lt;/strong> - some engines struggle to maintain high RPS; this might be related to the need to build a filtering mask for the dataset, as described above.&lt;/p></description></item><item><title>Benchmarks F.A.Q.</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/benchmarks/benchmark-faq/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/benchmarks/benchmark-faq/</guid><description>&lt;h1 id="benchmarks-faq">Benchmarks F.A.Q.&lt;/h1>
&lt;h2 id="are-we-biased">Are we biased?&lt;/h2>
&lt;p>Probably, yes. Even though we try to be objective, we are not experts in all the existing vector databases.
We build Qdrant and know it best.
Because of that, we may have missed important tweaks in other vector search engines.&lt;/p>
&lt;p>However, we tried our best, kept scrolling the docs up and down, experimented with combinations of different configurations, and gave all of them an equal chance to stand out. If you believe you can do it better than us, our &lt;strong>benchmarks are fully &lt;a href="https://github.com/qdrant/vector-db-benchmark" target="_blank" rel="noopener nofollow">open-sourced&lt;/a>, and contributions are welcome&lt;/strong>!&lt;/p></description></item></channel></rss>