<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>FastEmbed on Qdrant - Vector Database</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/documentation/fastembed/</link><description>Recent content in FastEmbed on Qdrant - Vector Database</description><generator>Hugo</generator><language>en-us</language><managingEditor>info@qdrant.tech (Andrey Vasnetsov)</managingEditor><webMaster>info@qdrant.tech (Andrey Vasnetsov)</webMaster><atom:link href="https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/documentation/fastembed/index.xml" rel="self" type="application/rss+xml"/><item><title>Quickstart</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/documentation/fastembed/fastembed-quickstart/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/documentation/fastembed/fastembed-quickstart/</guid><description>&lt;h1 id="how-to-generate-text-embedings-with-fastembed">How to Generate Text Embeddings with FastEmbed&lt;/h1>
&lt;h2 id="install-fastembed">Install FastEmbed&lt;/h2>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="n">pip&lt;/span> &lt;span class="n">install&lt;/span> &lt;span class="n">fastembed&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Just for demo purposes, you will use Python lists and NumPy to work with sample data.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="kn">from&lt;/span> &lt;span class="nn">typing&lt;/span> &lt;span class="kn">import&lt;/span> &lt;span class="n">List&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="kn">import&lt;/span> &lt;span class="nn">numpy&lt;/span> &lt;span class="k">as&lt;/span> &lt;span class="nn">np&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="load-default-model">Load default model&lt;/h2>
&lt;p>In this example, you will use the default text embedding model, &lt;code>BAAI/bge-small-en-v1.5&lt;/code>.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="kn">from&lt;/span> &lt;span class="nn">fastembed&lt;/span> &lt;span class="kn">import&lt;/span> &lt;span class="n">TextEmbedding&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="add-sample-data">Add sample data&lt;/h2>
&lt;p>Now, add two sample documents. Your documents must be in a list, and each document must be a string.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="n">documents&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="n">List&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="nb">str&lt;/span>&lt;span class="p">]&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">[&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="s2">&amp;#34;FastEmbed is lighter than Transformers &amp;amp; Sentence-Transformers.&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="s2">&amp;#34;FastEmbed is supported by and maintained by Qdrant.&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">]&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Download and initialize the model. Print a message to verify the process.&lt;/p></description></item><item><title>FastEmbed &amp; Qdrant</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/documentation/fastembed/fastembed-semantic-search/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/documentation/fastembed/fastembed-semantic-search/</guid><description>&lt;h1 id="using-fastembed-with-qdrant-for-vector-search">Using FastEmbed with Qdrant for Vector Search&lt;/h1>
&lt;h2 id="install-qdrant-client-and-fastembed">Install Qdrant Client and FastEmbed&lt;/h2>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="n">pip&lt;/span> &lt;span class="n">install&lt;/span> &lt;span class="s2">&amp;#34;qdrant-client[fastembed]&amp;gt;=1.14.2&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="initialize-the-client">Initialize the client&lt;/h2>
&lt;p>Qdrant Client has a simple in-memory mode that lets you try semantic search locally.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="kn">from&lt;/span> &lt;span class="nn">qdrant_client&lt;/span> &lt;span class="kn">import&lt;/span> &lt;span class="n">QdrantClient&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">models&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">client&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">QdrantClient&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;:memory:&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="c1"># Qdrant is running from RAM.&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="add-data">Add data&lt;/h2>
&lt;p>Now you can add two sample documents, their associated metadata, and a point &lt;code>id&lt;/code> for each.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="n">docs&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">[&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="s2">&amp;#34;Qdrant has a LangChain integration for chatbots.&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="s2">&amp;#34;Qdrant has a LlamaIndex integration for agents.&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">]&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">metadata&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">[&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="p">{&lt;/span>&lt;span class="s2">&amp;#34;source&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="s2">&amp;#34;langchain-docs&amp;#34;&lt;/span>&lt;span class="p">},&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="p">{&lt;/span>&lt;span class="s2">&amp;#34;source&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="s2">&amp;#34;llamaindex-docs&amp;#34;&lt;/span>&lt;span class="p">},&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">]&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">ids&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="mi">42&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="mi">2&lt;/span>&lt;span class="p">]&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="create-a-collection">Create a collection&lt;/h2>
&lt;p>Qdrant stores vectors and associated metadata in collections.
A collection requires vector parameters to be set during creation.
In this tutorial, we&amp;rsquo;ll be using &lt;code>BAAI/bge-small-en&lt;/code> to compute embeddings.&lt;/p></description></item><item><title>Working with miniCOIL</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/documentation/fastembed/fastembed-minicoil/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/documentation/fastembed/fastembed-minicoil/</guid><description>&lt;h1 id="how-to-use-minicoil-qdrants-sparse-neural-retriever">How to use miniCOIL, Qdrant&amp;rsquo;s Sparse Neural Retriever&lt;/h1>
&lt;p>&lt;strong>miniCOIL&lt;/strong> is an open-source sparse neural retrieval model that acts as if a BM25-based retriever understood the contextual meaning of keywords and ranked results accordingly.&lt;/p>
&lt;p>&lt;strong>miniCOIL&lt;/strong> scoring is based on the BM25 formula scaled by the semantic similarity between matched keywords in a query and a document.
$$
\text{miniCOIL}(D,Q) = \sum_{i=1}^{N} \text{IDF}(q_i) \cdot \text{Importance}^{q_i}_{D} \cdot {\color{YellowGreen}\text{Meaning}^{q_i \times d_j}} \text{, where keyword } d_j \in D \text{ equals } q_i
$$&lt;/p></description></item><item><title>Working with SPLADE</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/documentation/fastembed/fastembed-splade/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/documentation/fastembed/fastembed-splade/</guid><description>&lt;h1 id="how-to-generate-sparse-vectors-with-splade">How to Generate Sparse Vectors with SPLADE&lt;/h1>
&lt;p>SPLADE is a novel method for learning sparse text representation vectors, outperforming BM25 in tasks like information retrieval and document classification. Its main advantage is generating efficient and interpretable sparse vectors, making it effective for large-scale text data.&lt;/p>
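To make "sparse vector" concrete before the setup steps: a sparse embedding stores only nonzero token ids and their weights, and two sparse vectors are scored by a dot product over the ids they share. The sketch below uses plain NumPy with made-up ids and weights (a SPLADE model produces the real ones from text; nothing here is a FastEmbed call):

```python
import numpy as np

# A sparse embedding keeps only the nonzero entries: token ids ("indices")
# and their learned weights ("values"). These numbers are made up for
# illustration -- a SPLADE model derives them from the input text.
doc = {"indices": np.array([17, 301, 2048]), "values": np.array([0.9, 0.4, 1.2])}
query = {"indices": np.array([301, 999, 2048]), "values": np.array([0.7, 0.3, 0.5])}

def sparse_dot(a: dict, b: dict) -> float:
    # Relevance is the dot product taken over the token ids both sides share.
    common, a_idx, b_idx = np.intersect1d(
        a["indices"], b["indices"], return_indices=True
    )
    return float(np.dot(a["values"][a_idx], b["values"][b_idx]))

score = sparse_dot(query, doc)
print(round(score, 2))  # shared ids 301 and 2048: 0.7*0.4 + 0.5*1.2 = 0.88
```

Because each index maps back to a vocabulary token, you can inspect exactly which terms drive a match, which is what makes these vectors interpretable.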
&lt;h2 id="setup">Setup&lt;/h2>
&lt;p>First, install FastEmbed.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="n">pip&lt;/span> &lt;span class="n">install&lt;/span> &lt;span class="o">-&lt;/span>&lt;span class="n">q&lt;/span> &lt;span class="n">fastembed&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Next, import the required modules for sparse embeddings and Python’s typing module.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="kn">from&lt;/span> &lt;span class="nn">fastembed&lt;/span> &lt;span class="kn">import&lt;/span> &lt;span class="n">SparseTextEmbedding&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">SparseEmbedding&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>You can check the list of all supported sparse embedding models at any time.&lt;/p></description></item><item><title>Working with ColBERT</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/documentation/fastembed/fastembed-colbert/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/documentation/fastembed/fastembed-colbert/</guid><description>&lt;h1 id="how-to-generate-colbert-multivectors-with-fastembed">How to Generate ColBERT Multivectors with FastEmbed&lt;/h1>
&lt;h2 id="colbert">ColBERT&lt;/h2>
&lt;p>ColBERT is an embedding model that produces a matrix (multivector) representation of input text,
generating one vector per token (a token being a meaningful text unit for a machine learning model).
This approach allows ColBERT to capture more nuanced input semantics than many dense embedding models,
which represent an entire input with a single vector. By producing more granular input representations,
ColBERT becomes a strong retriever. However, this advantage comes at the cost of increased resource consumption compared to
traditional dense embedding models, both in terms of speed and memory.&lt;/p></description></item><item><title>Reranking with FastEmbed</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/documentation/fastembed/fastembed-rerankers/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/documentation/fastembed/fastembed-rerankers/</guid><description>&lt;h1 id="how-to-use-rerankers-with-fastembed">How to use rerankers with FastEmbed&lt;/h1>
&lt;h2 id="rerankers">Rerankers&lt;/h2>
&lt;p>A reranker is a model that improves the ordering of search results. A subset of documents is initially retrieved using a fast, simple method (e.g., BM25 or dense embeddings). Then, a reranker &amp;ndash; a more powerful, precise, but slower and heavier model &amp;ndash; re-evaluates this subset to refine document relevance to the query.&lt;/p>
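The two-stage pattern can be sketched in a few lines of plain Python. The scoring functions below are deliberately simple stand-ins (token overlap for the fast stage, position-weighted overlap for the slow stage), not FastEmbed calls:

```python
# Toy sketch of the retrieve-then-rerank pattern. Both scorers are
# hypothetical stand-ins for illustration only.

def cheap_score(query: str, doc: str) -> float:
    # First stage: fast lexical overlap (the role BM25 or dense search plays).
    q_tokens, d_tokens = set(query.lower().split()), set(doc.lower().split())
    return len(q_tokens & d_tokens) / (len(q_tokens) or 1)

def expensive_score(query: str, doc: str) -> float:
    # Second stage stand-in: a real reranker models token-level
    # query-document interactions; here we merely weight hits by position.
    d_tokens = doc.lower().split()
    return sum(
        1.0 / (1 + d_tokens.index(q))
        for q in query.lower().split()
        if q in d_tokens
    )

corpus = [
    "Qdrant is a vector database",
    "Rerankers refine the order of retrieved documents",
    "FastEmbed generates embeddings",
    "A vector database stores embeddings",
]
query = "vector database"

# Stage 1: retrieve a small candidate set with the cheap scorer.
candidates = sorted(corpus, key=lambda d: cheap_score(query, d), reverse=True)[:2]
# Stage 2: rerank only those candidates with the expensive scorer.
reranked = sorted(candidates, key=lambda d: expensive_score(query, d), reverse=True)
print(reranked[0])  # "A vector database stores embeddings"
```

The expensive scorer touches only two documents instead of the whole corpus, which is exactly the trade described above.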
&lt;p>Rerankers analyze token-level interactions between the query and each document in depth, making them expensive to use but precise in defining relevance. They trade speed for accuracy, so they are best used on a limited candidate set rather than the entire corpus.&lt;/p></description></item><item><title>Multi-Vector Postprocessing</title><link>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/documentation/fastembed/fastembed-postprocessing/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><author>info@qdrant.tech (Andrey Vasnetsov)</author><guid>https://deploy-preview-2138--condescending-goldwasser-91acf0.netlify.app/documentation/fastembed/fastembed-postprocessing/</guid><description>&lt;h1 id="multi-vector-postprocessing">Multi-Vector Postprocessing&lt;/h1>
&lt;p>FastEmbed&amp;rsquo;s postprocessing module provides techniques for transforming and optimizing embeddings after generation. These
postprocessing methods can improve search performance, reduce storage requirements, or adapt embeddings for specific use
cases.&lt;/p>
&lt;p>Currently, the postprocessing module includes MUVERA (Multi-Vector Retrieval via Fixed Dimensional Encodings) for speeding up search over multi-vector
embeddings. Additional postprocessing techniques are planned for future releases.&lt;/p>
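To preview why this kind of postprocessing helps: multi-vector search is expensive, so a common trick is to collapse each token matrix into one fixed-size vector for coarse retrieval and keep the original multivectors only for reranking. Below is a toy NumPy sketch of that idea; mean pooling stands in for the real fixed-dimensional encoding, which is more involved:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy multi-vector "documents": each is a (num_tokens, dim) matrix,
# as a ColBERT-style model would produce. Shapes are arbitrary.
docs = [rng.normal(size=(n, 8)) for n in (3, 5, 4)]
query = rng.normal(size=(4, 8))

def pool(mv: np.ndarray) -> np.ndarray:
    # Stand-in for a fixed-dimensional encoding: mean-pool the token
    # vectors into one unit-length vector.
    v = mv.mean(axis=0)
    return v / np.linalg.norm(v)

def maxsim(q: np.ndarray, d: np.ndarray) -> float:
    # Late-interaction (MaxSim) score on the original multivectors:
    # each query token takes its best-matching document token.
    return float((q @ d.T).max(axis=1).sum())

# Stage 1: fast single-vector retrieval over the pooled approximations.
q_pooled = pool(query)
coarse = sorted(range(len(docs)), key=lambda i: q_pooled @ pool(docs[i]), reverse=True)
candidates = coarse[:2]

# Stage 2: exact MaxSim reranking on the small candidate set only.
best = max(candidates, key=lambda i: maxsim(query, docs[i]))
print(best)
```

The coarse stage is compatible with ordinary single-vector indexes such as HNSW, while the costly MaxSim computation runs only on the shortlist.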
&lt;h2 id="muvera">MUVERA&lt;/h2>
&lt;p>MUVERA transforms variable-length sequences of vectors into fixed-dimensional single-vector representations. These
approximations can be used for fast initial retrieval using traditional vector search methods like HNSW. Once you&amp;rsquo;ve
retrieved a small set of candidates quickly, you can then rerank them using the original multi-vector representations
for maximum accuracy.&lt;/p></description></item></channel></rss>