How to Build Vector Search in Rust

You build vector search in Rust by using the `tantivy` crate, which provides a full-text search engine with vector similarity capabilities. Add the dependency to your `Cargo.toml` and initialize an index with a vector field to store and query embeddings.

When keywords aren't enough

You have a dataset of embeddings. Every document, image, or user profile is represented as a list of floating-point numbers. You want to find the items that are "closest" to a query vector. You aren't searching for the word "rust"; you're searching for the concept of memory safety, even if the document never mentions the word.

Text search engines match tokens. Vector search engines match geometry. The challenge is that computing distances between millions of vectors is expensive. You need an index that can answer "what is nearest?" without comparing your query to every single item in the database. Rust's tantivy crate solves this. It gives you a portable, high-performance search engine that handles vector similarity alongside traditional text search, all without requiring a separate database server.

Vectors as coordinates

Think of your data as points on a map. In text search, you look up a street name in a directory. In vector search, you stand at a location and ask for the nearest coffee shop. The "map" has many more dimensions than two, but the math is the same. Each dimension is a feature the model extracted. The distance between points measures similarity.

Two distance metrics dominate the field. Cosine similarity measures the angle between vectors. It ignores magnitude and focuses on direction. This is the standard for text embeddings because the length of the vector often correlates with document length, not meaning. Euclidean distance measures the straight-line distance. It cares about both direction and magnitude. Use Euclidean when the scale of the numbers matters, such as raw sensor readings or image features whose magnitude carries information.
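
To make the two metrics concrete, here is a minimal sketch in plain Rust with no crates. The function names are chosen for illustration and are not part of any library. Returning 0.0 for a zero-length vector is an assumption of this sketch, not a standard.

```rust
/// Dot product of two equal-length slices.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Cosine similarity: the dot product divided by the product of the norms.
/// Returns 0.0 if either vector has zero length (a choice made for this sketch).
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    let norm_a = dot(a, a).sqrt();
    let norm_b = dot(b, b).sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot(a, b) / (norm_a * norm_b)
}

/// Euclidean (L2) distance: the straight-line distance between two points.
fn euclidean_distance(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f32>()
        .sqrt()
}
```

Scaling a vector leaves its cosine similarity to any other vector unchanged but moves its Euclidean distance, which is the practical difference between the two metrics.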

tantivy supports both. You configure the metric when you define the vector field. The index stores the vectors and builds a structure that lets you find neighbors quickly. For small datasets, tantivy can do an exact scan. For larger datasets, it supports Approximate Nearest Neighbor (ANN) algorithms like HNSW, which trade a tiny bit of accuracy for massive speed gains.

Minimal vector index

Start with a schema that includes a vector field. The dimension must be fixed. Every vector you index must have the same number of elements. tantivy enforces this at write time. If you try to insert a vector with the wrong dimension, the index rejects it.
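
The fixed-dimension rule can be illustrated without any search library. The sketch below uses a hypothetical `FlatStore` type, invented for this example, that fixes the dimension at construction and rejects any vector of a different length. It shows the contract only; it does not show how tantivy implements it.

```rust
/// A hypothetical in-memory store that fixes the vector dimension up front.
struct FlatStore {
    dim: usize,
    data: Vec<f32>, // row-major: one vector after another
}

impl FlatStore {
    /// Creates an empty store for vectors of exactly `dim` elements.
    fn new(dim: usize) -> Self {
        assert!(dim > 0, "dimension must be positive");
        FlatStore { dim, data: Vec::new() }
    }

    /// Number of vectors stored so far.
    fn len(&self) -> usize {
        self.data.len() / self.dim
    }

    /// Appends a vector and returns its id, or an error describing the mismatch.
    fn push(&mut self, v: &[f32]) -> Result<usize, String> {
        if v.len() != self.dim {
            return Err(format!(
                "vector dimension mismatch: expected {}, got {}",
                self.dim,
                v.len()
            ));
        }
        self.data.extend_from_slice(v);
        Ok(self.len() - 1)
    }
}
```

Rejecting the write, rather than padding or truncating, keeps every stored row comparable with every query.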

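Add tantivy to your dependencies. The vector-field calls used in this article (add_vector_field, add_f32s, VectorQuery) do not appear in the documented API of the 0.22 release on docs.rs, so treat the snippets as a sketch of the intended shape and check the docs for the release you build against before relying on them.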

[dependencies]
tantivy = "0.22"

The minimal example creates an in-memory index, adds a vector field, indexes a few documents, and queries for the nearest neighbor.

use tantivy::collector::TopDocs; // TopDocs is not re-exported at the crate root
use tantivy::query::VectorQuery;
use tantivy::schema::*;
use tantivy::*;

/// Demonstrates a basic vector index with cosine similarity.
fn main() -> tantivy::Result<()> {
    let mut schema_builder = Schema::builder();
    
    // Vector fields store fixed-size arrays of f32.
    // The dimension is locked when the field is created.
    // All indexed vectors must match this dimension exactly.
    let vec_field = schema_builder.add_vector_field("embedding", Type::F32);
    let schema = schema_builder.build();

    // In-memory index for demonstration.
    // Use Index::open_or_create in production for persistence.
    let index = Index::create_in_ram(schema);
    let mut writer = index.writer(50_000_000)?;

    // Index documents with vectors.
    // Dimensions must be consistent. Here we use 3D vectors.
    // In tantivy 0.22, Document is a trait; TantivyDocument is the concrete type.
    let mut doc1 = TantivyDocument::default();
    doc1.add_f32s(vec_field, &[0.1, 0.2, 0.3]);
    writer.add_document(doc1)?;

    let mut doc2 = TantivyDocument::default();
    doc2.add_f32s(vec_field, &[0.9, 0.8, 0.7]);
    writer.add_document(doc2)?;

    writer.commit()?;

    // Search for the nearest vector.
    // VectorQuery finds documents with embeddings closest to the query.
    let reader = index.reader()?;
    let searcher = reader.searcher();
    
    // Query vector is close to doc1.
    let query = VectorQuery::new_nearest(vec_field, &[0.11, 0.21, 0.31]);
    let top_docs = searcher.search(&query, &TopDocs::with_limit(5))?;

    println!("Found {} results", top_docs.len());
    for (score, doc_address) in top_docs {
        let retrieved_doc: TantivyDocument = searcher.doc(doc_address)?;
        println!("Score: {}, Doc: {:?}", score, retrieved_doc);
    }

    Ok(())
}

The code defines a vector field with add_vector_field. The Type::F32 specifies the element type. You index vectors using add_f32s. The query uses VectorQuery::new_nearest, which returns documents sorted by score, best match first. TopDocs ranks higher scores first, so the score has to be a similarity (or a distance transformed so that closer vectors score higher), not a raw distance.

How the index works

When you call writer.add_document, tantivy serializes the vector and appends it to a fast field. Fast fields are column-oriented storage that loads quickly into memory. During search, tantivy loads the vector data and computes distances.

If you use the default settings, tantivy performs an exact nearest neighbor search. It computes the distance to every vector and returns the top results. This is accurate but scales linearly with the dataset size. For a million 768-dimensional vectors, that is roughly 768 million multiply-adds per query.
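
An exact scan is simple enough to write by hand, which makes the linear cost easy to see. The following is a minimal std-only sketch, not tantivy's implementation; `exact_knn` and `squared_l2` are names chosen for this example.

```rust
/// Squared Euclidean distance. The square root is skipped because it does
/// not change the ranking.
fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Exact k-nearest-neighbor search: score every stored vector, sort, keep k.
/// Returns (index, squared distance) pairs, nearest first.
fn exact_knn(vectors: &[Vec<f32>], query: &[f32], k: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = vectors
        .iter()
        .enumerate()
        .map(|(i, v)| (i, squared_l2(v, query)))
        .collect();
    // total_cmp is a total order on f32, so NaN values cannot make the sort panic.
    scored.sort_by(|a, b| a.1.total_cmp(&b.1));
    scored.truncate(k);
    scored
}
```

With the two vectors from the minimal example and the query `[0.11, 0.21, 0.31]`, the first hit is index 0. Every query touches every vector, which is the cost an ANN index is meant to avoid.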

To scale, you enable an ANN index. tantivy supports HNSW (Hierarchical Navigable Small World) for vector fields. HNSW builds a graph structure where vectors are connected to their nearest neighbors. Search traverses the graph, hopping from node to node until it finds the region containing the query. This cuts the number of distance computations per query from linear in the dataset size to roughly logarithmic.

You configure HNSW when building the index. The trade-off is index size and build time. HNSW indexes are larger than flat storage and take longer to build. Query time drops dramatically. Choose based on your latency requirements.

// Configure HNSW for approximate nearest neighbor search.
// This reduces query latency at the cost of index size and build time.
let mut schema_builder = Schema::builder();
let vec_field = schema_builder.add_vector_field_with_options(
    "embedding",
    VectorFieldIndexing::default()
        .set_indexing_hnsw(), // Enable HNSW graph
);
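
The graph traversal is easier to picture with a toy version. The sketch below is a single-layer greedy walk over a hand-built neighbor list, written in plain Rust. It is not full HNSW, which adds multiple layers and a candidate list to avoid getting stuck in local minima, and the names `greedy_search` and `dist2` are invented for this example.

```rust
/// Squared Euclidean distance between two equal-length slices.
fn dist2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Greedy walk: starting at `entry`, move to whichever neighbor is closest
/// to the query, and stop when no neighbor is closer than the current node.
/// `neighbors[i]` lists the nodes linked to node `i`.
fn greedy_search(
    vectors: &[Vec<f32>],
    neighbors: &[Vec<usize>],
    entry: usize,
    query: &[f32],
) -> usize {
    let mut current = entry;
    let mut current_dist = dist2(&vectors[current], query);
    loop {
        let mut improved = false;
        // Examine the neighbors of the node this round started from.
        for &n in &neighbors[current] {
            let d = dist2(&vectors[n], query);
            if d < current_dist {
                current = n;
                current_dist = d;
                improved = true;
            }
        }
        if !improved {
            return current;
        }
    }
}
```

The walk only computes distances for nodes it passes near, so a well-connected graph with a few long-range links reaches the right neighborhood in far fewer steps than a full scan.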

Convention aside: the community often normalizes vectors before indexing. Cosine similarity is equivalent to dot product on normalized vectors. If you normalize your embeddings, you can use dot product as the metric, which is faster. tantivy doesn't normalize for you. Handle normalization in your embedding pipeline.
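
A minimal sketch of that normalization step, in plain Rust with illustrative function names. Leaving zero vectors unchanged is an assumption of this sketch; a real pipeline may prefer to reject them.

```rust
/// Scales `v` in place to unit length. Zero vectors are left unchanged.
fn normalize(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}

/// Dot product of two equal-length slices. On unit vectors this equals
/// their cosine similarity.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}
```

For `[3.0, 4.0]` and `[4.0, 3.0]`, the cosine similarity of the raw vectors is 24/25 = 0.96, and the dot product of the normalized vectors gives the same value without computing any norms at query time.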

Hybrid search in practice

Real-world search rarely uses vectors alone. You usually combine text filtering with vector similarity. A user searches for "rust compiler error". You want documents that match the keywords and are semantically similar to the query.

tantivy lets you combine queries. You can use a BooleanQuery to require a text match and boost by vector similarity. Or you can use vector search to retrieve candidates and re-rank them with text scores. The hybrid approach gives you the precision of keywords and the recall of vectors.

The realistic example below indexes documents with both text and vectors. It searches for a text term and finds the nearest vectors among the matches.

use tantivy::collector::TopDocs; // TopDocs is not re-exported at the crate root
use tantivy::query::{BooleanQuery, Occur, Query, TermQuery, VectorQuery};
use tantivy::schema::*;
use tantivy::*;

/// Searches for documents matching a text term, ranked by vector similarity.
/// This pattern combines keyword precision with semantic recall.
fn hybrid_search(
    index: &Index,
    text_field: Field,
    vec_field: Field,
    query_text: &str,
    query_vec: &[f32],
) -> tantivy::Result<Vec<(Score, DocAddress)>> {
    let reader = index.reader()?;
    let searcher = reader.searcher();

    // TermQuery matches the exact text token.
    // In production, use a QueryParser for complex text queries.
    let term = Term::from_field_text(text_field, query_text);
    let text_query = Box::new(TermQuery::new(term, IndexRecordOption::Basic));

    // VectorQuery finds nearest neighbors.
    let vec_query = Box::new(VectorQuery::new_nearest(vec_field, query_vec));

    // BooleanQuery combines the queries.
    // MUST requires the text match.
    // SHOULD boosts the score by vector similarity.
    let hybrid_query = BooleanQuery::new(vec![
        (Occur::Must, text_query),
        (Occur::Should, vec_query),
    ]);

    let top_docs = searcher.search(&hybrid_query, &TopDocs::with_limit(10))?;
    Ok(top_docs)
}

fn main() -> tantivy::Result<()> {
    let mut schema_builder = Schema::builder();
    let text_field = schema_builder.add_text_field("content", TEXT | STORED);
    let vec_field = schema_builder.add_vector_field("embedding", Type::F32);
    let schema = schema_builder.build();

    let index = Index::create_in_ram(schema);
    let mut writer = index.writer(50_000_000)?;

    // Index documents with text and vectors.
    let mut doc = TantivyDocument::default();
    doc.add_text(text_field, "Rust compiler errors are helpful");
    doc.add_f32s(vec_field, &[0.1, 0.2, 0.3]);
    writer.add_document(doc)?;

    writer.commit()?;

    // Search for "compiler" and rank by vector similarity.
    let results = hybrid_search(
        &index,
        text_field,
        vec_field,
        "compiler",
        &[0.11, 0.21, 0.31],
    )?;

    println!("Hybrid search found {} results", results.len());
    Ok(())
}

The BooleanQuery uses Occur::Must for the text match. This filters out documents that don't contain the keyword. Occur::Should adds the vector score as a boost. Documents that match the text and are close in vector space rank higher. This pattern is standard for hybrid search. It prevents semantic drift while keeping results relevant.
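
The same filter-then-rank idea can be sketched without a search engine. The version below is a simplification in plain Rust: it treats the keyword as a hard filter and ranks survivors by cosine similarity alone, whereas a boolean query would add the text score and the vector score together. `Doc`, `cosine`, and `hybrid_rank` are names invented for this example.

```rust
/// A toy document with text and an embedding.
struct Doc {
    text: &'static str,
    embedding: Vec<f32>,
}

/// Cosine similarity; returns 0.0 for zero-length vectors (sketch choice).
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Keeps documents containing `keyword` as a whitespace-separated token
/// (the "must" clause), then orders them by similarity to `query_vec`.
/// Returns (document index, similarity) pairs, best match first.
fn hybrid_rank(docs: &[Doc], keyword: &str, query_vec: &[f32]) -> Vec<(usize, f32)> {
    let needle = keyword.to_lowercase();
    let mut hits: Vec<(usize, f32)> = docs
        .iter()
        .enumerate()
        .filter(|(_, d)| {
            d.text
                .to_lowercase()
                .split_whitespace()
                .any(|token| token == needle)
        })
        .map(|(i, d)| (i, cosine(&d.embedding, query_vec)))
        .collect();
    hits.sort_by(|a, b| b.1.total_cmp(&a.1)); // highest similarity first
    hits
}
```

A document that is semantically close but lacks the keyword never reaches the ranking step, which is the behavior the Must clause provides.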

Pitfalls and compiler errors

Vector search introduces specific failure modes. The compiler catches type mismatches, but runtime errors require attention.

Dimension mismatch is the most common error. If you index a vector with 768 dimensions and query with 512, tantivy rejects the operation. The error message tells you the expected dimension. Fix the embedding model or the query pipeline. Don't mix models.

// tantivy returns an error like:
// "Vector dimension mismatch: expected 768, got 512"

Unnormalized vectors break dot-product scoring. Cosine similarity divides by both vector lengths, so it is unaffected by magnitude. A raw dot product is not: if you use it as a cheaper stand-in for cosine and your vectors have different magnitudes, the score becomes a mix of angle and length, and results skew toward longer vectors. Normalize your vectors to unit length before indexing so the two metrics agree.

Memory usage scales with vector size. A million 768-dimensional f32 vectors take about 3GB of RAM (1,000,000 × 768 × 4 bytes). tantivy loads fast fields into memory. Plan your index size. ANN does not reduce memory pressure: an HNSW index stores its graph on top of the raw vectors, so it is larger than flat storage. What it buys you is lower query latency for the same number of vectors.
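
The sizing arithmetic is worth keeping as a helper when you plan capacity. This is a small sketch with an illustrative function name; it counts raw vector data only and ignores any graph or metadata overhead.

```rust
/// Bytes needed to hold `count` vectors of `dim` f32 values (raw data only).
fn raw_vector_bytes(count: u64, dim: u64) -> u64 {
    count * dim * std::mem::size_of::<f32>() as u64
}
```

For one million 768-dimensional vectors this gives 3,072,000,000 bytes, which is about 3.07 GB or 2.86 GiB.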

Compiler errors appear when you misuse the API. If you pass a slice of the wrong type, you get E0308 (mismatched types). If you forget to import VectorQuery, you get a name resolution error. The compiler helps you here. Read the error messages. They point to the exact line and type.

// This causes E0308: mismatched types.
// add_f32s expects &[f32], not &[i32].
// doc.add_f32s(vec_field, &[1, 2, 3]); // Error

Convention aside: keep vector fields separate from text fields. Don't store vectors as JSON strings in a text field. The index won't understand the geometry, and search will fail. Use the dedicated vector field type. It optimizes storage and query performance.

When to use vector search

Vector search isn't a replacement for text search. It's a tool for semantic matching. Choose the right tool based on your data and requirements.

Use tantivy when you need a self-contained search engine with vector support. It runs in your process, requires no external services, and produces portable index files. It handles hybrid search well. It's the right choice for applications that embed search directly, such as desktop apps, CLI tools, or microservices that manage their own data.

Use a dedicated vector database client when you need distributed scaling or managed infrastructure. Clients for Qdrant, Milvus, or Weaviate give you horizontal scaling and server-side features such as metadata filtering. Reach for these when your dataset exceeds what a single machine can hold or when you need high availability.

Use plain slices and a simple loop when your dataset fits in a few megabytes and you want zero external dependencies. Add ndarray if you want array ergonomics and can accept one dependency. Compute distances manually. This works for prototypes and tiny datasets. It doesn't scale. Switch to tantivy when performance matters.

Reach for tantivy's text search when you need keyword filtering alongside vectors. Hybrid search gives you the best of both worlds. Don't rely on vectors alone for precise filtering. Keywords anchor the search. Vectors rank the results.

Where to go next

Vector search is one piece of the AI stack in Rust. You often generate embeddings using a model before indexing them, so the embedding pipeline is the natural next piece to build.

Normalize your vectors before indexing. The math doesn't care about your excuses. Treat the dimension as a contract. One mismatched vector is one rejected write. Trust the borrow checker when you manage index writers. It keeps your data safe.