diff --git a/README.md b/README.md
index 3e8786b..4b89db7 100755
--- a/README.md
+++ b/README.md
@@ -26,7 +26,7 @@ LEANN achieves this through *graph-based selective recomputation* with *high-deg
 LEANN vs Traditional Vector DB Storage Comparison

-> **The numbers speak for themselves:** Index 60 million Wikipedia chunks in just 6GB instead of 201GB. From emails to browser history, everything fits on your laptop. [See detailed benchmarks for different applications below ↓](#storage-usage-comparison)
+> **The numbers speak for themselves:** Index 60 million Wikipedia chunks in just 6GB instead of 201GB. From emails to browser history, everything fits on your laptop. [See detailed benchmarks for different applications below ↓](#storage-comparison-on-different-applications)
 
 🔒 **Privacy:** Your data never leaves your laptop. No OpenAI, no cloud, no "terms of service".
@@ -402,7 +402,7 @@ Options:
 📊 **[Simple Example: Compare LEANN vs FAISS →](examples/compare_faiss_vs_leann.py)**
 
-### Storage Comparison
+### Storage Comparison on Different Applications {#storage-comparison-on-different-applications}
 
 | System | DPR (2.1M) | Wiki (60M) | Chat (400K) | Email (780K) | Browser (38K) |
 |--------|-------------|------------|-------------|--------------|---------------|
@@ -439,98 +439,15 @@ If you find Leann useful, please cite:
 }
 ```
-## ✨ Features
+## ✨ [Detailed Features →](docs/features.md)
-
-### 🔥 Core Features
-
-- **🔄 Real-time Embeddings** - Eliminate heavy embedding storage with dynamic computation using optimized ZMQ servers and highly optimized search paradigm (overlapping and batching) with highly optimized embedding engine
-- **📈 Scalable Architecture** - Handles millions of documents on consumer hardware; the larger your dataset, the more LEANN can save
-- **🎯 Graph Pruning** - Advanced techniques to minimize the storage overhead of vector search to a limited footprint
-- **🏗️ Pluggable Backends** - DiskANN, HNSW/FAISS with unified API
-
-### 🛠️ Technical Highlights
-- **🔄 Recompute Mode** - Highest accuracy scenarios while eliminating vector storage overhead
-- **⚡ Zero-copy Operations** - Minimize IPC overhead by transferring distances instead of embeddings
-- **🚀 High-throughput Embedding Pipeline** - Optimized batched processing for maximum efficiency
-- **🎯 Two-level Search** - Novel coarse-to-fine search overlap for accelerated query processing (optional)
-- **💾 Memory-mapped Indices** - Fast startup with raw text mapping to reduce memory overhead
-- **🚀 MLX Support** - Ultra-fast recompute/build with quantized embedding models, accelerating building and search ([minimal example](test/build_mlx_index.py))
-
-### 🎨 Developer Experience
-
-- **Simple Python API** - Get started in minutes
-- **Extensible backend system** - Easy to add new algorithms
-- **Comprehensive examples** - From basic usage to production deployment
-
-## 🤝 Contributing
-
-We welcome contributions! Leann is built by the community, for the community.
-
-### Ways to Contribute
-
-- 🐛 **Bug Reports**: Found an issue? Let us know!
-- 💡 **Feature Requests**: Have an idea? We'd love to hear it!
-- 🔧 **Code Contributions**: PRs welcome for all skill levels
-- 📖 **Documentation**: Help make Leann more accessible
-- 🧪 **Benchmarks**: Share your performance results
+## 🤝 [Contributing →](docs/contributing.md)
 
-
-## FAQ
-
-### 1. My building time seems long
-
-You can speed up the process by using a lightweight embedding model. Add this to your arguments:
-
-```bash
---embedding-model sentence-transformers/all-MiniLM-L6-v2
-```
-**Model sizes:** `all-MiniLM-L6-v2` (30M parameters), `facebook/contriever` (~100M parameters), `Qwen3-0.6B` (600M parameters)
+## [FAQ →](docs/faq.md)
 
-## 📈 Roadmap
-
-### 🎯 Q2 2025
-
-- [X] DiskANN backend with MIPS/L2/Cosine support
-- [X] HNSW backend integration
-- [X] Real-time embedding pipeline
-- [X] Memory-efficient graph pruning
-
-### 🚀 Q3 2025
-
-
-- [ ] Advanced caching strategies
-- [ ] Add contextual-retrieval https://www.anthropic.com/news/contextual-retrieval
-- [ ] Add sleep-time-compute and summarize agent! to summarilze the file on computer!
-- [ ] Add OpenAI recompute API
-
-### 🌟 Q4 2025
-
-- [ ] Integration with LangChain/LlamaIndex
-- [ ] Visual similarity search
-- [ ] Query rewrtiting, rerank and expansion
+## 📈 [Roadmap →](docs/roadmap.md)
 
 ## 📄 License
@@ -538,11 +455,7 @@ MIT License - see [LICENSE](LICENSE) for details.
 ## 🙏 Acknowledgments
 
-- **Microsoft Research** for the DiskANN algorithm
-- **Meta AI** for FAISS and optimization insights
-- **HuggingFace** for the transformer ecosystem
-- **Our amazing contributors** who make this possible
-
+This work is done at [**Berkeley Sky Computing Lab**](https://sky.cs.berkeley.edu/)
 ---

diff --git a/docs/contributing.md b/docs/contributing.md
new file mode 100644
index 0000000..e8d262c
--- /dev/null
+++ b/docs/contributing.md
@@ -0,0 +1,11 @@
+# 🤝 Contributing
+
+We welcome contributions! Leann is built by the community, for the community.
+
+## Ways to Contribute
+
+- 🐛 **Bug Reports**: Found an issue? Let us know!
+- 💡 **Feature Requests**: Have an idea? We'd love to hear it!
+- 🔧 **Code Contributions**: PRs welcome for all skill levels
+- 📖 **Documentation**: Help make Leann more accessible
+- 🧪 **Benchmarks**: Share your performance results
\ No newline at end of file
diff --git a/docs/faq.md b/docs/faq.md
new file mode 100644
index 0000000..ba06e1a
--- /dev/null
+++ b/docs/faq.md
@@ -0,0 +1,10 @@
+# FAQ
+
+## 1. My building time seems long
+
+You can speed up the process by using a lightweight embedding model. Add this to your arguments:
+
+```bash
+--embedding-model sentence-transformers/all-MiniLM-L6-v2
+```
+**Model sizes:** `all-MiniLM-L6-v2` (30M parameters), `facebook/contriever` (~100M parameters), `Qwen3-0.6B` (600M parameters)
\ No newline at end of file
diff --git a/docs/features.md b/docs/features.md
new file mode 100644
index 0000000..a0abf85
--- /dev/null
+++ b/docs/features.md
@@ -0,0 +1,22 @@
+# ✨ Detailed Features
+
+## 🔥 Core Features
+
+- **🔄 Real-time Embeddings** - Eliminate heavy embedding storage by computing embeddings on demand, using optimized ZMQ servers and a search paradigm that overlaps and batches work on a highly optimized embedding engine
+- **📈 Scalable Architecture** - Handles millions of documents on consumer hardware; the larger your dataset, the more LEANN can save
+- **🎯 Graph Pruning** - Advanced techniques that keep the storage overhead of vector search to a limited footprint
+- **🏗️ Pluggable Backends** - DiskANN and HNSW/FAISS behind a unified API
+
+## 🛠️ Technical Highlights
+- **🔄 Recompute Mode** - Delivers the highest accuracy while eliminating vector storage overhead
+- **⚡ Zero-copy Operations** - Minimize IPC overhead by transferring distances instead of embeddings
+- **🚀 High-throughput Embedding Pipeline** - Optimized batched processing for maximum efficiency
+- **🎯 Two-level Search** - Novel coarse-to-fine search overlap for accelerated query processing (optional)
+- **💾 Memory-mapped Indices** - Fast startup with raw text mapping to reduce memory overhead
+- **🚀 MLX Support** - Ultra-fast recompute/build with quantized embedding models, accelerating both building and search ([minimal example](test/build_mlx_index.py))
+
+## 🎨 Developer Experience
+
+- **Simple Python API** - Get started in minutes
+- **Extensible backend system** - Easy to add new algorithms
+- **Comprehensive examples** - From basic usage to production deployment
\ No newline at end of file
diff --git a/docs/roadmap.md b/docs/roadmap.md
new file mode 100644
index 0000000..ac6a839
--- /dev/null
+++ b/docs/roadmap.md
@@ -0,0 +1,21 @@
+# 📈 Roadmap
+
+## 🎯 Q2 2025
+
+- [X] DiskANN backend with MIPS/L2/Cosine support
+- [X] HNSW backend integration
+- [X] Real-time embedding pipeline
+- [X] Memory-efficient graph pruning
+
+## 🚀 Q3 2025
+
+- [ ] Advanced caching strategies
+- [ ] Add contextual retrieval (https://www.anthropic.com/news/contextual-retrieval)
+- [ ] Add sleep-time compute and a summarization agent to summarize files on your computer
+- [ ] Add OpenAI recompute API
+
+## 🌟 Q4 2025
+
+- [ ] Integration with LangChain/LlamaIndex
+- [ ] Visual similarity search
+- [ ] Query rewriting, reranking, and expansion
\ No newline at end of file
diff --git a/examples/compare_faiss_vs_leann.py b/examples/compare_faiss_vs_leann.py
index 2a2a55a..ea0ef3e 100644
--- a/examples/compare_faiss_vs_leann.py
+++ b/examples/compare_faiss_vs_leann.py
@@ -135,6 +135,7 @@ def test_leann_hnsw():
         nodes = node_parser.get_nodes_from_documents([doc])
         for node in nodes:
             all_texts.append(node.get_content())
+    print(f"Total number of chunks: {len(all_texts)}")
 
     tracker.checkpoint("After text chunking")
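
The storage numbers this patch advertises rest on one idea: LEANN stores only the graph and the raw text, then recomputes embeddings for just the handful of nodes a query actually visits. The sketch below illustrates that *graph-based selective recomputation* in miniature. It is a minimal sketch, not LEANN's actual API: the `embed()` stub, the toy adjacency dict, and `greedy_search()` are hypothetical stand-ins (the real system batches such recomputations through its optimized ZMQ embedding servers).

```python
import zlib
import numpy as np

# Hypothetical stand-in for an embedding model (deterministic per text).
# LEANN itself batches these calls through optimized ZMQ embedding servers.
def embed(text: str) -> np.ndarray:
    rng = np.random.default_rng(zlib.crc32(text.encode()))
    vec = rng.standard_normal(8)
    return vec / np.linalg.norm(vec)

def greedy_search(graph, texts, query, entry=0, max_steps=10):
    """Walk the proximity graph toward the query, embedding only visited nodes.

    Only the graph and the raw texts are stored; every embedding used here
    is recomputed on the fly, which is where the storage savings come from.
    """
    q = embed(query)
    current = entry
    best_sim = float(embed(texts[current]) @ q)
    visited = {current}
    for _ in range(max_steps):
        frontier = [n for n in graph[current] if n not in visited]
        if not frontier:
            break
        visited.update(frontier)
        # Recompute embeddings for just the frontier -- never loaded from disk.
        sims = {n: float(embed(texts[n]) @ q) for n in frontier}
        best_nbr = max(sims, key=sims.get)
        if sims[best_nbr] <= best_sim:
            break  # local optimum: no neighbor is closer to the query
        current, best_sim = best_nbr, sims[best_nbr]
    return current

texts = ["cat care tips", "dog training", "galaxy formation", "black hole physics"]
graph = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
print("best match:", texts[greedy_search(graph, texts, "event horizons")])
```

Note that no full embedding matrix is ever held in memory or on disk; per query, only `len(visited)` embeddings are computed, which is why the graph pruning described in docs/features.md (fewer neighbors to score) translates directly into less recomputation.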