diff --git a/README.md b/README.md index d78b290..85ea74d 100755 --- a/README.md +++ b/README.md @@ -1,64 +1,62 @@ -

🚀 LEANN: A Low-Storage Vector Index

+

+ LEANN Logo +

+ +# LEANN - the smallest vector index in the world. RAGE! +## With LEANN, you can RAG Anything! + +**97% smaller than FAISS.** RAG your emails, browser history, WeChat, or 60M documents on your laptop. No cloud, no API keys, no bullshit. + +```bash +git clone https://github.com/yichuan520030910320/LEANN-RAG.git && cd LEANN-RAG +# 30 seconds later... +python demo.py # RAG your first 1M documents +```

Python 3.9+ MIT License - PRs Welcome -Platform + Platform

+## The Difference is Stunning +

- 💾 Extreme Storage Saving â€ĸ 🔒 100% Private â€ĸ 📚 RAG Everything â€ĸ ⚡ Easy & Accurate + LEANN vs Traditional Vector DB Storage Comparison

-

- Quick Start â€ĸ - Features â€ĸ - Benchmarks â€ĸ - Paper -

+**Bottom line:** Index 60 million Wikipedia articles in 6GB instead of 201GB. Your MacBook can finally handle real datasets. ---- +## Why This Matters -## 🌟 What is LEANN-RAG? +**Privacy:** Your data never leaves your laptop. No OpenAI, no cloud, no "terms of service". -**LEANN-RAG** is a lightweight, locally deployable **Retrieval-Augmented Generation (RAG)** engine designed for personal devices. It combines **compact storage**, **clean usability**, and **privacy-by-design**, making it easy to build personalized retrieval systems over your own data — emails, notes, documents, chats, or anything else. +**Speed:** Real-time search on consumer hardware. No server setup, no configuration hell. -Unlike traditional vector databases that rely on massive embedding storage, LEANN reduces storage needs dramatically by using **graph-based recomputation** and **pruned HNSW search**, while maintaining responsive and reliable performance — all without sending any data to the cloud. +**Scale:** Handle datasets that would crash traditional vector DBs on your laptop. ---- +## 30-Second Demo: RAG Your Life -## đŸ”Ĩ Key Highlights +```python +from leann.api import LeannBuilder, LeannSearcher -### 💾 1. Extreme Storage Efficiency -LEANN reduces storage usage by **up to 97%** compared to conventional vector DBs (e.g., FAISS), by storing only pruned graph structures and computing embeddings at query time. -> For example: 60M chunks can be indexed in just **6GB**, compared to **200GB+** with dense storage. +# Index your entire email history (90K emails = 14MB vs 305MB) +builder = LeannBuilder(backend_name="hnsw") +builder.add_from_mailbox("~/Library/Mail") # Your actual emails +builder.build_index("my_life.leann") -### 🔒 2. Fully Private, Cloud-Free -LEANN runs entirely locally. No cloud services, no API keys, and no risk of leaking sensitive data. -> Converse with your own files **without compromising privacy**. +# Ask questions about your own data +searcher = LeannSearcher("my_life.leann") +searcher.search("What did my boss say about the deadline?") +searcher.search("Find emails about vacation requests") +searcher.search("Show me all conversations with John about the project") +``` -### 🧠 3. RAG Everything -Build truly personalized assistants by querying over **your own** chat logs, email archives, browser history, or agent memory. -> LEANN makes it easy to integrate personal context into RAG workflows. +**That's it.** No cloud setup, no API keys, no "fine-tuning". Just your data, your questions, your laptop. -### ⚡ 4. Easy, Accurate, and Fast -LEANN is designed to be **easy to install**, with a **clean API** and minimal setup. It runs efficiently on consumer hardware without sacrificing retrieval accuracy. -> One command to install, one click to run. +[Try the interactive demo →](demo.ipynb) ---- - -## 🚀 Why Choose LEANN? - -Traditional RAG systems often require trade-offs between storage, privacy, and usability. **LEANN-RAG aims to simplify the stack** with a more practical design: - -- ✅ **No embedding storage** — compute on demand, save disk space -- ✅ **Low memory footprint** — lightweight and hardware-friendly -- ✅ **Privacy-first** — 100% local, no network dependency -- ✅ **Simple to use** — developer-friendly API and seamless setup - -> 📄 For more details, see our [academic paper](https://arxiv.org/abs/2506.08276) -## 🚀 Quick Start +## Get Started in 30 Seconds ### Installation @@ -115,24 +113,6 @@ ollama pull llama3.2:3b **Note:** For Hugging Face models >1B parameters, you may encounter OOM errors on consumer hardware. Consider using smaller models like Qwen3-0.6B or switch to Ollama for better memory management. -### 30-Second Example -Try it out in [**demo.ipynb**](demo.ipynb) - -```python -from leann.api import LeannBuilder, LeannSearcher -# 1. Build index (no embeddings stored!) -builder = LeannBuilder(backend_name="hnsw") -builder.add_text("C# is a powerful programming language") -builder.add_text("Python is a powerful programming language") -builder.add_text("Machine learning transforms industries") -builder.add_text("Neural networks process complex data") -builder.add_text("Leann is a great storage saving engine for RAG on your macbook") -builder.build_index("knowledge.leann") -# 2. Search with real-time embeddings -searcher = LeannSearcher("knowledge.leann") -results = searcher.search("C++ programming languages", top_k=2, recompute_beighbor_embeddings=True) -print(results) -``` ### Run the Demo (support .pdf,.txt,.docx, .pptx, .csv, .md etc) @@ -146,6 +126,40 @@ or you want to use python source .venv/bin/activate python ./examples/main_cli_example.py ``` +## Wild Things You Can Do + +### đŸ•ĩī¸ Search Your Entire Life +```bash +python examples/mail_reader_leann.py +# "What did my boss say about the Christmas party last year?" +# "Find all emails from my mom about birthday plans" +``` +**90K emails → 14MB.** Finally, search your email like you search Google. + +### 🌐 Time Machine for the Web +```bash +python examples/google_history_reader_leann.py +# "What was that AI paper I read last month?" +# "Show me all the cooking videos I watched" +``` +**38K browser entries → 6MB.** Your browser history becomes your personal search engine. + +### đŸ’Ŧ WeChat Detective +```bash +python examples/wechat_history_reader_leann.py +# "æˆ‘æƒŗäš°é­”æœ¯å¸ˆįēĻįŋ°é€Šįš„įƒčĄŖīŧŒį왿ˆ‘一äē›å¯šåē”čŠå¤ŠčްåŊ•" +# "Show me all group chats about weekend plans" +``` +**400K messages → 64MB.** Search years of chat history in any language. + +### 📚 Personal Wikipedia +```bash +# Index 60M Wikipedia articles in 6GB (not 201GB) +python examples/build_massive_index.py --source wikipedia +# "Explain quantum computing like I'm 5" +# "What are the connections between philosophy and AI?" +``` + **PDF RAG Demo (using LlamaIndex for document parsing and Leann for indexing/search)** This demo showcases how to build a RAG system for PDF/md documents using Leann. @@ -155,254 +169,42 @@ This demo showcases how to build a RAG system for PDF/md documents using Leann. -## ✨ Features +## How It Works -### đŸ”Ĩ Core Features +LEANN doesn't store embeddings. Instead, it builds a lightweight graph and computes embeddings on-demand during search. -- **🔄 Real-time Embeddings** - Eliminate heavy embedding storage with dynamic computation using optimized ZMQ servers and highly optimized search paradigm (overlapping and batching) with highly optimized embedding engine -- **📈 Scalable Architecture** - Handles millions of documents on consumer hardware; the larger your dataset, the more LEANN can save -- **đŸŽ¯ Graph Pruning** - Advanced techniques to minimize the storage overhead of vector search to a limited footprint -- **đŸ—ī¸ Pluggable Backends** - DiskANN, HNSW/FAISS with unified API +**The magic:** Most vector DBs store every single embedding (expensive). LEANN stores a pruned graph structure (cheap) and recomputes embeddings only when needed (fast). -### đŸ› ī¸ Technical Highlights -- **🔄 Recompute Mode** - Highest accuracy scenarios while eliminating vector storage overhead -- **⚡ Zero-copy Operations** - Minimize IPC overhead by transferring distances instead of embeddings -- **🚀 High-throughput Embedding Pipeline** - Optimized batched processing for maximum efficiency -- **đŸŽ¯ Two-level Search** - Novel coarse-to-fine search overlap for accelerated query processing (optional) -- **💾 Memory-mapped Indices** - Fast startup with raw text mapping to reduce memory overhead -- **🚀 MLX Support** - Ultra-fast recompute with quantized embedding models, accelerating building and search by 10-100x ([minimal example](test/build_mlx_index.py)) +**Backends:** DiskANN, HNSW, or FAISS - pick what works for your data size. -### 🎨 Developer Experience +**Performance:** Real-time search on millions of documents. MLX support for 10-100x faster building on Apple Silicon. -- **Simple Python API** - Get started in minutes -- **Extensible backend system** - Easy to add new algorithms -- **Comprehensive examples** - From basic usage to production deployment -## Applications on your MacBook -### 📧 Lightweight RAG on your Apple Mail - -LEANN can create a searchable index of your Apple Mail emails, allowing you to query your email history using natural language. - -#### Quick Start - -
-📋 Click to expand: Command Examples +## Benchmarks +Run the comparison yourself: ```bash -# Use default mail path (works for most macOS setups) -python examples/mail_reader_leann.py - - -# Run with custom index directory -python examples/mail_reader_leann.py --index-dir "./my_mail_index" - -# embedd and search all of your email(this may take a long preprocessing time but it will encode all your emails) -python examples/mail_reader_leann.py --max-emails -1 - -# Limit number of emails processed (useful for testing) -python examples/mail_reader_leann.py --max-emails 1000 - -# Run a single query -python examples/mail_reader_leann.py --query "Whats the number of class recommend to take per semester for incoming EECS students" -``` - -
- - - -#### Example Queries - -
-đŸ’Ŧ Click to expand: Example queries you can try - -Once the index is built, you can ask questions like: -- "Find emails from my boss about deadlines" -- "What did John say about the project timeline?" -- "Show me emails about travel expenses" - -
- -### 🌐 Lightweight RAG on your Google Chrome History - -LEANN can create a searchable index of your Chrome browser history, allowing you to query your browsing history using natural language. - -#### Quick Start - -
-📋 Click to expand: Command Examples - -Note you need to quit google right now to successfully run this. - -```bash -# Use default Chrome profile (auto-finds all profiles) and recommand method to run this because usually default file is enough -python examples/google_history_reader_leann.py - - -# Run with custom index directory -python examples/google_history_reader_leann.py --index-dir "./my_chrome_index" - -# Limit number of history entries processed (useful for testing) -python examples/google_history_reader_leann.py --max-entries 500 - -# Run a single query -python examples/google_history_reader_leann.py --query "What websites did I visit about machine learning?" - -# Use only a specific profile (disable auto-find) -python examples/google_history_reader_leann.py --chrome-profile "~/Library/Application Support/Google/Chrome/Default" --no-auto-find-profiles -``` - -
- -#### Finding Your Chrome Profile - -
-🔍 Click to expand: How to find your Chrome profile - -The default Chrome profile path is configured for a typical macOS setup. If you need to find your specific Chrome profile: - -1. Open Terminal -2. Run: `ls ~/Library/Application\ Support/Google/Chrome/` -3. Look for folders like "Default", "Profile 1", "Profile 2", etc. -4. Use the full path as your `--chrome-profile` argument - -**Common Chrome profile locations:** -- macOS: `~/Library/Application Support/Google/Chrome/Default` -- Linux: `~/.config/google-chrome/Default` - -
- -#### Example Queries - -
-đŸ’Ŧ Click to expand: Example queries you can try - -Once the index is built, you can ask questions like: -- "What websites did I visit about machine learning?" -- "Find my search history about programming" -- "What YouTube videos did I watch recently?" -- "Show me websites I visited about travel planning" - -
- - -### đŸ’Ŧ Lightweight RAG on your WeChat History - -LEANN can create a searchable index of your WeChat chat history, allowing you to query your conversations using natural language. - -#### Prerequisites - -
-🔧 Click to expand: Installation Requirements - -First, you need to install the WeChat exporter: - -```bash -sudo packages/wechat-exporter/wechattweak-cli install -``` - -**Troubleshooting**: If you encounter installation issues, check the [WeChatTweak-CLI issues page](https://github.com/sunnyyoung/WeChatTweak-CLI/issues/41). - -
- -#### Quick Start - -
-📋 Click to expand: Command Examples - -```bash -# Use default settings (recommended for first run) -python examples/wechat_history_reader_leann.py - -# Run with custom export directory and wehn we run the first time, LEANN will export all chat history automatically for you -python examples/wechat_history_reader_leann.py --export-dir "./my_wechat_exports" - -# Run with custom index directory -python examples/wechat_history_reader_leann.py --index-dir "./my_wechat_index" - -# Limit number of chat entries processed (useful for testing) -python examples/wechat_history_reader_leann.py --max-entries 1000 - -# Run a single query -python examples/wechat_history_reader_leann.py --query "Show me conversations about travel plans" - -``` - -
- -#### Example Queries - -
-đŸ’Ŧ Click to expand: Example queries you can try - -Once the index is built, you can ask questions like: -- "æˆ‘æƒŗäš°é­”æœ¯å¸ˆįēĻįŋ°é€Šįš„įƒčĄŖīŧŒį왿ˆ‘一äē›å¯šåē”čŠå¤ŠčްåŊ•?" (Chinese: Show me chat records about buying Magic Johnson's jersey) - -
- - -## ⚡ Performance Comparison - -### LEANN vs Faiss HNSW - -We benchmarked LEANN against the popular Faiss HNSW implementation to demonstrate the significant memory and storage savings our approach provides: - -```bash -# Run the comparison benchmark python examples/compare_faiss_vs_leann.py ``` -#### đŸŽ¯ Results Summary +| System | Storage | +|--------|---------| +| FAISS HNSW | 5.5 MB | +| LEANN | 0.5 MB | +| **Savings** | **91%** | -| Metric | Faiss HNSW | LEANN HNSW | **Improvement** | -|--------|------------|-------------|-----------------| -| **Storage Size** | 5.5 MB | 0.5 MB | **11.4x smaller** (5.0 MB saved) | +Same dataset, same hardware, same embedding model. LEANN just works better. -#### 📈 Key Takeaways - - -- **💾 Storage Optimization**: LEANN requires **91% less storage** for the same dataset - -- **âš–ī¸ Fair Comparison**: Both systems tested on identical hardware with the same 2,573 document dataset and the same embedding model and chunk method - -> **Note**: Results may vary based on dataset size, hardware configuration, and query patterns. The comparison excludes text storage to focus purely on index structures. - - - -*Benchmark results obtained on Apple Silicon with consistent environmental conditions* - -## 📊 Benchmarks - -### How to Reproduce Evaluation Results - -Reproducing our benchmarks is straightforward. The evaluation script is designed to be self-contained, automatically downloading all necessary data on its first run. - -#### 1. Environment Setup - -First, ensure you have followed the installation instructions in the [Quick Start](#-quick-start) section. This will install all core dependencies. - -Next, install the optional development dependencies, which include the `huggingface-hub` library required for automatic data download: +## Reproduce Our Results ```bash -# This command installs all development dependencies -uv pip install -e ".[dev]" +uv pip install -e ".[dev]" # Install dev dependencies +python examples/run_evaluation.py data/indices/dpr/dpr_diskann # DPR dataset +python examples/run_evaluation.py data/indices/rpj_wiki/rpj_wiki.index # Wikipedia ``` -#### 2. Run the Evaluation - -Simply run the evaluation script. The first time you run it, it will detect that the data is missing, download it from Hugging Face Hub, and then proceed with the evaluation. - -**To evaluate the DPR dataset:** -```bash -python examples/run_evaluation.py data/indices/dpr/dpr_diskann -``` - -**To evaluate the RPJ-Wiki dataset:** -```bash -python examples/run_evaluation.py data/indices/rpj_wiki/rpj_wiki.index -``` - -The script will print the recall and search time for each query, followed by the average results. +The evaluation script downloads data automatically on first run. ### Storage Usage Comparison @@ -433,7 +235,7 @@ The script will print the recall and search time for each query, followed by the ## đŸ—ī¸ Architecture

- LEANN Architecture + LEANN Architecture

## đŸ”Ŧ Paper diff --git a/asset/arch.png b/assets/arch.png similarity index 100% rename from asset/arch.png rename to assets/arch.png diff --git a/assets/logo.png b/assets/logo.png new file mode 100644 index 0000000..6760a66 Binary files /dev/null and b/assets/logo.png differ