HNSW Index Storage Optimization

This document explains the storage optimization features available in the HNSW backend.

Storage Modes

The HNSW backend supports two optimization techniques, each controlled by its own flag:

1. CSR Compression (`is_compact=True`)

Converts the graph structure from standard format to Compressed Sparse Row (CSR) format
Reduces memory overhead from graph adjacency storage
Maintains all embedding data for direct access
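As an illustration of the idea only, the sketch below flattens per-node neighbor lists into the two arrays that CSR uses: an offsets array and a flat index array. The function names and the toy graph are hypothetical, and a real HNSW index has multiple layers, so the backend's actual on-disk layout will differ.

```python
from typing import List, Tuple


def adjacency_to_csr(neighbors: List[List[int]]) -> Tuple[List[int], List[int]]:
    """Flatten per-node neighbor lists into CSR offset and index arrays."""
    indptr = [0]              # indptr[i] is where node i's neighbors start
    indices: List[int] = []   # all neighbor ids, stored back to back
    for node_neighbors in neighbors:
        indices.extend(node_neighbors)
        indptr.append(len(indices))
    return indptr, indices


def csr_neighbors(indptr: List[int], indices: List[int], node: int) -> List[int]:
    """Return the neighbors of `node` by slicing the flat index array."""
    return indices[indptr[node]:indptr[node + 1]]


# Toy single-layer graph with 4 nodes (hypothetical data).
graph = [[1, 2], [0], [0, 3], [2]]
indptr, indices = adjacency_to_csr(graph)
```

The saving comes from replacing one container per node with two flat arrays; neighbor lookup stays a constant-time slice.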

2. Embedding Pruning (`is_recompute=True`)

Removes embedding vectors from the index file
Replaces them with a NULL storage marker
Requires embeddings to be recomputed by an embedding server during search
Must be used with `is_compact=True` for efficiency
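The recompute-on-demand idea can be sketched as a store that keeps passage text instead of vectors and asks an embedding function for vectors only when the search visits a node. Everything here (`RecomputingVectorStore`, `toy_embed`, the passage dictionary) is a hypothetical stand-in, not the backend's actual interface; in practice the embedding function would call the embedding server.

```python
from typing import Callable, Dict, List, Sequence

EmbedFn = Callable[[Sequence[str]], List[List[float]]]


class RecomputingVectorStore:
    """Hypothetical store that keeps passages, not embeddings."""

    def __init__(self, passages: Dict[int, str], embed: EmbedFn) -> None:
        self._passages = passages
        self._embed = embed
        self.calls = 0  # number of embedding requests issued

    def get_vectors(self, node_ids: Sequence[int]) -> List[List[float]]:
        # Batch the ids so that one search hop costs one embedding request.
        self.calls += 1
        return self._embed([self._passages[i] for i in node_ids])


def toy_embed(texts: Sequence[str]) -> List[List[float]]:
    # Stand-in for a real embedding server: two cheap text features.
    return [[float(len(t)), float(t.count(" "))] for t in texts]


store = RecomputingVectorStore({0: "hello world", 1: "hnsw"}, toy_embed)
vectors = store.get_vectors([0, 1])
```

Batching the node ids per request matters because each request to a real server would add a network round-trip.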

Performance Impact

Storage Reduction (100 vectors, 384 dimensions):

Standard format:     168 KB (embeddings + graph)
CSR only:            160 KB (embeddings + compressed graph)
CSR + Pruned:         6 KB (compressed graph only)

Key Benefits:

CSR compression: ~5% size reduction from graph optimization (168 KB to 160 KB)
Embedding pruning: ~95% size reduction by removing embeddings
Combined: Up to 96% total storage reduction (168 KB to 6 KB)
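The figures above can be checked with a little arithmetic. The sketch below assumes 4-byte float32 embeddings and 1 KB = 1024 bytes; neither is stated in this document. Under those assumptions the raw embeddings alone account for 150 KB of the file.

```python
def reduction_percent(before_kb: float, after_kb: float) -> float:
    """Percentage saved when going from before_kb to after_kb."""
    return 100.0 * (before_kb - after_kb) / before_kb


# Assumption: embeddings are float32 (4 bytes) and 1 KB = 1024 bytes.
num_vectors, dims, bytes_per_float = 100, 384, 4
embedding_kb = num_vectors * dims * bytes_per_float / 1024

# Sizes taken from the table above.
standard_kb, csr_kb, pruned_kb = 168, 160, 6

csr_saving = reduction_percent(standard_kb, csr_kb)          # roughly 5%
combined_saving = reduction_percent(standard_kb, pruned_kb)  # roughly 96%
```

Since the embeddings make up most of the file, removing them accounts for nearly all of the saving, while CSR compression only affects the much smaller graph portion.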

Usage

from leann import LeannBuilder

# Standard format (largest)
builder = LeannBuilder(
    backend_name="hnsw",
    is_compact=False,
    is_recompute=False
)

# CSR compressed (medium)
builder = LeannBuilder(
    backend_name="hnsw",
    is_compact=True,
    is_recompute=False
)

# CSR + Pruned (smallest, requires embedding server)
builder = LeannBuilder(
    backend_name="hnsw",
    is_compact=True,      # Required for pruning
    is_recompute=True     # Default: enabled
)

Trade-offs

Mode          Storage   Search Speed  Memory Usage  Setup Complexity
Standard      Largest   Fastest       Highest       Simple
CSR           Medium    Fast          Medium        Simple
CSR + Pruned  Smallest  Slower*       Lowest        Complex**

*Requires a network round-trip to the embedding server for recomputation
**Needs an embedding server and the passages file at search time
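One way to encode the trade-offs above is a small helper that picks the two flags from deployment constraints. `choose_storage_flags` and its parameters are hypothetical and not part of the library; this is a sketch of the decision only.

```python
from typing import Dict


def choose_storage_flags(
    embedding_server_available: bool,
    minimize_storage: bool,
) -> Dict[str, bool]:
    """Pick is_compact / is_recompute following the trade-off table."""
    if minimize_storage and embedding_server_available:
        # CSR + Pruned: smallest index, but search needs the embedding server.
        return {"is_compact": True, "is_recompute": True}
    if minimize_storage:
        # CSR only: smaller graph, embeddings stay in the index.
        return {"is_compact": True, "is_recompute": False}
    # Standard: largest index, fastest search, simplest setup.
    return {"is_compact": False, "is_recompute": False}


flags = choose_storage_flags(embedding_server_available=False, minimize_storage=True)
```

The resulting dictionary could then be unpacked into the builder call shown under Usage, for example `LeannBuilder(backend_name="hnsw", **flags)`.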
