HNSW Index Storage Optimization

This document explains the storage optimization features available in the HNSW backend.

Storage Modes

The HNSW backend supports two optimization techniques, each controlled by its own flag:

1. CSR Compression (is_compact=True)

  • Converts the graph structure from standard format to Compressed Sparse Row (CSR) format
  • Reduces memory overhead from graph adjacency storage
  • Maintains all embedding data for direct access
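To make the CSR idea concrete, here is a minimal sketch of how per-node neighbor lists can be flattened into two arrays. It illustrates the general CSR layout on a single-level toy graph only; the function names `to_csr` and `neighbors_of` are hypothetical, and the actual on-disk layout used by the HNSW backend may differ.

```python
from typing import List, Tuple


def to_csr(adjacency: List[List[int]]) -> Tuple[List[int], List[int]]:
    """Flatten per-node neighbor lists into CSR (offsets, neighbors) arrays."""
    offsets = [0]
    neighbors: List[int] = []
    for node_neighbors in adjacency:
        neighbors.extend(node_neighbors)
        # offsets[i + 1] marks where node i's neighbors end in the flat array.
        offsets.append(len(neighbors))
    return offsets, neighbors


def neighbors_of(offsets: List[int], neighbors: List[int], node: int) -> List[int]:
    """Look up one node's neighbors by slicing the flat array."""
    return neighbors[offsets[node]:offsets[node + 1]]


# Toy graph with 4 nodes; node 3 has no outgoing edges.
graph = [[1, 2], [0], [0, 1, 3], []]
offsets, flat = to_csr(graph)
```

One flat neighbor array plus one offsets array replaces a separate container per node, which is where the saving in adjacency overhead comes from in this sketch.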

2. Embedding Pruning (is_recompute=True)

  • Removes embedding vectors from the index file
  • Replaces them with a NULL storage marker
  • Requires embeddings to be recomputed by an embedding server during search
  • Must be used with is_compact=True for efficiency
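The pruning behavior described above can be sketched with a toy vector store. This is an illustration under stated assumptions, not the backend's implementation: `VectorStore`, `fake_embedding_server`, and the use of `None` as the NULL storage marker are all hypothetical stand-ins.

```python
from typing import Callable, Dict, List, Optional, Sequence

Vector = List[float]


class VectorStore:
    """Toy store: returns stored vectors, or recomputes them when pruned."""

    def __init__(
        self,
        vectors: Optional[Dict[int, Vector]],
        recompute: Optional[Callable[[Sequence[int]], List[Vector]]] = None,
    ) -> None:
        # vectors=None plays the role of the NULL storage marker (assumption).
        self.vectors = vectors
        self.recompute = recompute
        self.recompute_calls = 0

    def get(self, ids: Sequence[int]) -> List[Vector]:
        if self.vectors is not None:
            # Unpruned index: embeddings are read directly from storage.
            return [self.vectors[i] for i in ids]
        if self.recompute is None:
            raise RuntimeError("pruned index needs a recompute function")
        # Pruned index: every lookup goes back to the embedding source.
        self.recompute_calls += 1
        return self.recompute(ids)


def fake_embedding_server(ids: Sequence[int]) -> List[Vector]:
    """Stand-in for the embedding server: deterministic 2-d vectors."""
    return [[float(i), float(i) * 2.0] for i in ids]


full = VectorStore({0: [0.0, 0.0], 1: [1.0, 2.0]})
pruned = VectorStore(None, recompute=fake_embedding_server)
```

Both stores answer `get([1])` with the same vector, but the pruned one pays for a recompute call each time, which is the storage-versus-latency trade this mode makes.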

Performance Impact

Storage Reduction (100 vectors, 384 dimensions):

Standard format:  168 KB (embeddings + graph)
CSR only:         160 KB (embeddings + compressed graph)
CSR + Pruned:       6 KB (compressed graph only)

Key Benefits:

  • CSR compression: ~5% size reduction from graph optimization (168 KB to 160 KB)
  • Embedding pruning: ~96% size reduction by removing embeddings (160 KB to 6 KB)
  • Combined: ~96% total storage reduction in this example (168 KB to 6 KB)
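The percentages can be checked directly from the sizes listed above. The short calculation below does that, and also estimates the raw embedding payload under the assumption that vectors are stored as float32 (4 bytes per value), which the document does not state.

```python
# Sizes in KB, copied from the storage figures listed above.
standard_kb = 168
csr_kb = 160
csr_pruned_kb = 6


def reduction_pct(before_kb: float, after_kb: float) -> float:
    """Percentage of storage saved going from before_kb to after_kb."""
    return 100.0 * (before_kb - after_kb) / before_kb


# Assumption: embeddings are float32, so 100 vectors x 384 dims x 4 bytes.
raw_embeddings_kb = 100 * 384 * 4 / 1024  # 150.0 KB

csr_saving = reduction_pct(standard_kb, csr_kb)              # about 4.8
pruning_saving = reduction_pct(csr_kb, csr_pruned_kb)        # about 96.3
combined_saving = reduction_pct(standard_kb, csr_pruned_kb)  # about 96.4
```

Under the float32 assumption the embeddings alone account for roughly 150 KB of the 160 KB CSR file, which is consistent with only a few KB remaining once they are pruned.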

Usage

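# Import path is an assumption; adjust it to match your LEANN installation.
from leann.api import LeannBuilder

# Standard format (largest)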
builder = LeannBuilder(
    backend_name="hnsw",
    is_compact=False,
    is_recompute=False
)

# CSR compressed (medium)
builder = LeannBuilder(
    backend_name="hnsw",
    is_compact=True,
    is_recompute=False
)

# CSR + Pruned (smallest, requires embedding server)
builder = LeannBuilder(
    backend_name="hnsw",
    is_compact=True,      # Required for pruning
    is_recompute=True     # Default: enabled
)
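The three configurations above can be captured in a small helper so that the flag combination is chosen by name and the unsupported pairing (pruning without CSR) is caught early. This is a sketch: the mode names, `storage_flags`, and `check_flags` are hypothetical, and whether the library itself rejects `is_recompute=True` with `is_compact=False` is not stated here.

```python
from typing import Dict

# Hypothetical mode names for illustration; only the two flags come from this README.
MODES: Dict[str, Dict[str, bool]] = {
    "standard": {"is_compact": False, "is_recompute": False},
    "csr": {"is_compact": True, "is_recompute": False},
    "csr_pruned": {"is_compact": True, "is_recompute": True},
}


def storage_flags(mode: str) -> Dict[str, bool]:
    """Return the builder keyword arguments for a named storage mode."""
    try:
        # Copy so callers cannot mutate the shared table.
        return dict(MODES[mode])
    except KeyError:
        raise ValueError(
            f"unknown mode {mode!r}; expected one of {sorted(MODES)}"
        ) from None


def check_flags(is_compact: bool, is_recompute: bool) -> None:
    """Reject pruning without CSR compression, a combination not listed above."""
    if is_recompute and not is_compact:
        raise ValueError("is_recompute=True expects is_compact=True")
```

With this in place, a builder could be created as `LeannBuilder(backend_name="hnsw", **storage_flags("csr_pruned"))`.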

Trade-offs

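| Mode | Storage | Search Speed | Memory Usage | Setup Complexity |
|------|---------|--------------|--------------|------------------|
| Standard | Largest | Fastest | Highest | Simple |
| CSR | Medium | Fast | Medium | Simple |
| CSR + Pruned | Smallest | Slower* | Lowest | Complex** |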

*Requires a network round-trip to the embedding server for recomputation

**Needs an embedding server and a passages file for search