docs: embedding pruning
68
tests/sanity_checks/README_hnsw_pruning.md
Normal file
@@ -0,0 +1,68 @@
# HNSW Index Storage Optimization

This document explains the storage optimization features available in the HNSW backend.

## Storage Modes

The HNSW backend supports two optimization techniques that can be combined:

### 1. CSR Compression (`is_compact=True`)
- Converts the graph structure from standard format to Compressed Sparse Row (CSR) format
- Reduces memory overhead from graph adjacency storage
- Maintains all embedding data for direct access
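
As a rough illustration of the CSR idea described above, the sketch below flattens per-node neighbor lists into two flat arrays. It shows the general technique only and makes no claim about the HNSW backend's actual on-disk layout; the names `to_csr` and `neighbors_of` are hypothetical.

```python
from typing import List, Tuple


def to_csr(adjacency: List[List[int]]) -> Tuple[List[int], List[int]]:
    """Flatten per-node neighbor lists into CSR (indptr, indices) arrays."""
    indptr = [0]
    indices: List[int] = []
    for neighbors in adjacency:
        indices.extend(neighbors)
        # indptr[i + 1] marks where node i's neighbors end in `indices`.
        indptr.append(len(indices))
    return indptr, indices


def neighbors_of(indptr: List[int], indices: List[int], node: int) -> List[int]:
    """Return the neighbor ids of `node` by slicing the flat index array."""
    return indices[indptr[node]:indptr[node + 1]]


# Toy 4-node graph; node 2 has no outgoing edges.
adjacency = [[1, 2], [0, 3], [], [1]]
indptr, indices = to_csr(adjacency)
```

Two flat arrays replace one list object per node, which is where the reduction in adjacency overhead comes from in this toy model.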

### 2. Embedding Pruning (`is_recompute=True`)
- Removes embedding vectors from the index file
- Replaces them with a NULL storage marker
- Requires embeddings to be recomputed by an embedding server during search
- Must be used with `is_compact=True` for efficiency
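
The effect of pruning on search can be pictured with a toy store that keeps only passage text and recomputes a vector each time one is needed. This is a minimal sketch, not the backend's implementation: `PrunedStore` and `toy_embed` are hypothetical, and the `embed` callable merely stands in for a request to the embedding server.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


class PrunedStore:
    """Holds passage text only; vectors are recomputed on every lookup."""

    def __init__(self, passages: Dict[int, str], embed: Callable[[str], Vector]):
        self.passages = passages
        self.embed = embed  # stand-in for a call to the embedding server
        self.recompute_calls = 0

    def vector(self, node_id: int) -> Vector:
        # No stored embedding exists, so it is rebuilt from the passage text.
        self.recompute_calls += 1
        return self.embed(self.passages[node_id])


def toy_embed(text: str) -> List[float]:
    # Deterministic toy embedding (length, vowel count); not a real model.
    return [float(len(text)), float(sum(ch in "aeiou" for ch in text))]


store = PrunedStore({0: "graph", 1: "embedding", 2: "index"}, toy_embed)
query = toy_embed("vector")
# Every candidate visited during search costs one recomputation.
nearest = min(store.passages, key=lambda i: math.dist(query, store.vector(i)))
```

Each distance evaluation triggers one recomputation, which is why the pruned mode trades search speed for storage.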

## Performance Impact

**Storage Reduction (100 vectors, 384 dimensions):**
```
Standard format:  168 KB (embeddings + graph)
CSR only:         160 KB (embeddings + compressed graph)
CSR + Pruned:       6 KB (compressed graph only)
```

**Key Benefits:**
- **CSR compression**: ~5% size reduction from graph optimization (168 KB to 160 KB in the example above)
- **Embedding pruning**: ~95% size reduction by removing embeddings (160 KB to 6 KB in the example above)
- **Combined**: Up to 96% total storage reduction (168 KB to 6 KB in the example above)
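
The percentages above can be checked directly against the reported sizes. The short calculation below uses only the figures from this document; the raw-embedding estimate additionally assumes 4-byte (float32) components, which the document does not state.

```python
# Index sizes in KB, as reported for 100 vectors of 384 dimensions.
sizes_kb = {"standard": 168, "csr": 160, "csr_pruned": 6}


def reduction_pct(before: float, after: float) -> float:
    """Percentage by which `after` is smaller than `before`."""
    return 100.0 * (before - after) / before


csr_saving = reduction_pct(sizes_kb["standard"], sizes_kb["csr"])
pruning_saving = reduction_pct(sizes_kb["csr"], sizes_kb["csr_pruned"])
combined_saving = reduction_pct(sizes_kb["standard"], sizes_kb["csr_pruned"])

# Assumption: float32 embeddings (4 bytes per component).
raw_embeddings_kb = 100 * 384 * 4 / 1024

print(f"CSR only:        {csr_saving:.1f}%")
print(f"Pruning on CSR:  {pruning_saving:.1f}%")
print(f"Combined:        {combined_saving:.1f}%")
print(f"Raw embeddings:  {raw_embeddings_kb:.0f} KB (float32 assumed)")
```

Under the float32 assumption, the embedding vectors alone account for most of the unpruned file, which is consistent with pruning giving the largest saving.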

## Usage

```python
# Import path is an assumption; adjust it to match your installation.
from leann.api import LeannBuilder

# Standard format (largest)
builder = LeannBuilder(
    backend_name="hnsw",
    is_compact=False,
    is_recompute=False,
)

# CSR compressed (medium)
builder = LeannBuilder(
    backend_name="hnsw",
    is_compact=True,
    is_recompute=False,
)

# CSR + Pruned (smallest, requires embedding server)
builder = LeannBuilder(
    backend_name="hnsw",
    is_compact=True,    # Required for pruning
    is_recompute=True,  # Default: enabled
)
```
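
To compare the three modes on your own data, one option is to build each index into its own directory and measure what ends up on disk. The helper below is a sketch using only the standard library; `dir_size_kb` is a hypothetical name, and the one-directory-per-index layout is an assumption, not something the backend prescribes.

```python
from pathlib import Path


def dir_size_kb(index_dir: str) -> float:
    """Total size in KB of all regular files under `index_dir`."""
    root = Path(index_dir)
    total_bytes = sum(p.stat().st_size for p in root.rglob("*") if p.is_file())
    return total_bytes / 1024
```

Calling it once per build directory gives numbers comparable to the storage figures reported under Performance Impact.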

## Trade-offs

| Mode | Storage | Search Speed | Memory Usage | Setup Complexity |
|------|---------|--------------|--------------|------------------|
| Standard | Largest | Fastest | Highest | Simple |
| CSR | Medium | Fast | Medium | Simple |
| CSR + Pruned | Smallest | Slower\* | Lowest | Complex\*\* |

\* Requires network round-trip to embedding server for recomputation

\*\* Needs embedding server and passages file for search