docs: how it works earlier

Andy Lee
2025-07-19 20:42:52 -07:00
parent 1f90cdfafb
commit 96f74973b1


@@ -16,6 +16,7 @@ LEANN is a revolutionary vector database that makes personal AI accessible to ev
RAG your **[emails](#-search-your-entire-life)**, **[browser history](#-time-machine-for-the-web)**, **[WeChat](#-wechat-detective)**, or 60M documents on your laptop, at nearly zero cost. No cloud, no API keys, completely private.
LEANN achieves this through graph-based selective recomputation with high-degree preserving pruning and dynamic batching, computing embeddings on-demand instead of storing them all. [Read more →](#-architecture--how-it-works)
## Why LEANN?
@@ -23,7 +24,7 @@ RAG your **[emails](#-search-your-entire-life)**, **[browser history](#-time-mac
<img src="assets/effects.png" alt="LEANN vs Traditional Vector DB Storage Comparison" width="100%">
</p>
**The numbers speak for themselves:** Index 60 million Wikipedia articles in just 6GB instead of 201GB. Finally, your MacBook can handle enterprise-scale datasets. [See detailed benchmarks below ↓](#benchmarks)
**The numbers speak for themselves:** Index 60 million Wikipedia articles in just 6GB instead of 201GB. Finally, your MacBook can handle enterprise-scale datasets. [See detailed benchmarks below ↓](#storage-usage-comparison)
## Why This Matters
@@ -217,17 +218,22 @@ This demo showcases how to build a RAG system for PDF/md documents using Leann.
## How It Works
## 🏗️ Architecture & How It Works
LEANN doesn't store embeddings. Instead, it builds a lightweight graph and computes embeddings on-demand during search.
<p align="center">
<img src="assets/arch.png" alt="LEANN Architecture" width="800">
</p>
**The magic:** Most vector DBs store every single embedding (expensive). LEANN stores a pruned graph structure (cheap) and recomputes embeddings only when needed (fast).
**Core techniques:**
- **Graph-based selective recomputation:** Only compute embeddings for nodes in the search path
- **High-degree preserving pruning:** Keep important "hub" nodes while removing redundant connections
- **Dynamic batching:** Efficiently batch embedding computations for GPU utilization
- **Two-level search:** Smart graph traversal that prioritizes promising nodes
**Backends:** DiskANN or HNSW - pick what works for your data size.
**Performance:** Real-time search on millions of documents.
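The high-degree preserving pruning named among the core techniques can be pictured with a small sketch. It rests on assumptions not stated in this README: hubs are picked by in-degree, neighbor lists are ordered nearest first, and the function name `prune_preserving_hubs` and its parameters are hypothetical. LEANN's real criterion and thresholds may differ.

```python
def prune_preserving_hubs(graph, hub_fraction=0.1, hub_degree=8, other_degree=2):
    """Cap every node's out-degree, allowing a larger cap for hub nodes.

    graph: dict mapping node id -> list of neighbor ids, nearest first (assumption).
    """
    # Count how often each node is referenced by others.
    in_degree = {n: 0 for n in graph}
    for nbrs in graph.values():
        for n in nbrs:
            in_degree[n] = in_degree.get(n, 0) + 1

    # Treat the most-referenced nodes as hubs (at least one).
    n_hubs = max(1, int(len(graph) * hub_fraction))
    hubs = set(sorted(graph, key=lambda n: (-in_degree[n], n))[:n_hubs])

    # Hubs keep more edges; every other node keeps only its nearest few.
    return {
        node: nbrs[: hub_degree if node in hubs else other_degree]
        for node, nbrs in graph.items()
    }


# Hypothetical example: node 0 is referenced by every other node.
dense = {
    0: [1, 2, 3, 4, 5],
    1: [0, 2, 3],
    2: [0, 1, 3],
    3: [0, 2, 4],
    4: [0, 3, 5],
    5: [0, 4, 1],
}
pruned = prune_preserving_hubs(dense, hub_degree=4, other_degree=2)
print(sum(map(len, dense.values())), "edges before,",
      sum(map(len, pruned.values())), "edges after")
```

Keeping the hub's edges largely intact preserves the shortcuts most search paths pass through, while trimming the rest shrinks the stored graph.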
## Benchmarks
Run the comparison yourself:
@@ -278,13 +284,6 @@ The evaluation script downloads data automatically on first run.
*Benchmarks run on Apple M3 Pro 36 GB*
## 🏗️ Architecture
<p align="center">
<img src="assets/arch.png" alt="LEANN Architecture" width="800">
</p>
## 🔬 Paper
If you find Leann useful, please cite: