merge: finalize compat resolution (delegate to PassageManager; keep relative hints in meta); resolve conflicts

This commit is contained in:
Andy Lee
2025-08-14 01:09:39 -07:00
25 changed files with 1911 additions and 477 deletions

View File

@@ -52,7 +52,7 @@ Based on our experience developing LEANN, embedding models fall into three categ
### Quick Start: Cloud and Local Embedding Options
**OpenAI Embeddings (Fastest Setup)**
For immediate testing without local model downloads:
For immediate testing without local model downloads(also if you [do not have GPU](https://github.com/yichuan-w/LEANN/issues/43) and do not care that much about your document leak, you should use this, we compute the embedding and recompute using openai API):
```bash
# Set OpenAI embeddings (requires OPENAI_API_KEY)
--embedding-mode openai --embedding-model text-embedding-3-small
@@ -97,16 +97,30 @@ ollama pull nomic-embed-text
```
### DiskANN
**Best for**: Large datasets (> 10M vectors, 10GB+ index size) - **⚠️ Beta version, still in active development**
- Uses Product Quantization (PQ) for coarse filtering during graph traversal
- Novel approach: stores only PQ codes, performs rerank with exact computation in final step
- Implements a corner case of double-queue: prunes all neighbors and recomputes at the end
**Best for**: Performance-critical applications and large datasets - **Production-ready with automatic graph partitioning**
**How it works:**
- **Product Quantization (PQ) + Real-time Reranking**: Uses compressed PQ codes for fast graph traversal, then recomputes exact embeddings for final candidates
- **Automatic Graph Partitioning**: When `is_recompute=True`, automatically partitions large indices and safely removes redundant files to save storage
- **Superior Speed-Accuracy Trade-off**: Faster search than HNSW while maintaining high accuracy
**Trade-offs compared to HNSW:**
-**Faster search latency** (typically 2-8x speedup)
-**Better scaling** for large datasets
-**Smart storage management** with automatic partitioning
-**Better graph locality** with `--ldg-times` parameter for SSD optimization
- ⚠️ **Slightly larger index size** due to PQ tables and graph metadata
```bash
# For billion-scale deployments
# Recommended for most use cases
--backend-name diskann --graph-degree 32 --build-complexity 64
# For large-scale deployments
--backend-name diskann --graph-degree 64 --build-complexity 128
```
**Performance Benchmark**: Run `python benchmarks/diskann_vs_hnsw_speed_comparison.py` to compare DiskANN and HNSW on your system.
## LLM Selection: Engine and Model Comparison
### LLM Engines
@@ -352,3 +366,4 @@ Trade-offs:
- [Lessons Learned Developing LEANN](https://yichuan-w.github.io/blog/lessons_learned_in_dev_leann/)
- [LEANN Technical Paper](https://arxiv.org/abs/2506.08276)
- [DiskANN Original Paper](https://papers.nips.cc/paper/2019/file/09853c7fb1d3f8ee67a61b6bf4a7f8e6-Paper.pdf)
- [SSD-based Graph Partitioning](https://github.com/SonglinLife/SSD_BASED_PLAN)