diff --git a/README.md b/README.md
index def003d..26eac8f 100755
--- a/README.md
+++ b/README.md
@@ -6,6 +6,7 @@
 Python 3.9+ MIT License Platform
+ MCP Integration

@@ -16,7 +17,10 @@ LEANN is an innovative vector database that democratizes personal AI. Transform
 LEANN achieves this through *graph-based selective recomputation* with *high-degree preserving pruning*, computing embeddings on-demand instead of storing them all. [Illustration Fig →](#️-architecture--how-it-works) | [Paper →](https://arxiv.org/abs/2506.08276)
-**Ready to RAG Everything?** Transform your laptop into a personal AI assistant that can search your **[codebase](#-claude-code-integration-transform-your-development-workflow)**, **[file system](#-personal-data-manager-process-any-documents-pdf-txt-md)**, **[emails](#-your-personal-email-secretary-rag-on-apple-mail)**, **[browser history](#-time-machine-for-the-web-rag-your-entire-browser-history)**, **[chat history](#-wechat-detective-unlock-your-golden-memories)**, or external knowledge bases (i.e., 60M documents) - all on your laptop, with zero cloud costs and complete privacy.
+**Ready to RAG Everything?** Transform your laptop into a personal AI assistant that can semantically search your **[file system](#-personal-data-manager-process-any-documents-pdf-txt-md)**, **[emails](#-your-personal-email-secretary-rag-on-apple-mail)**, **[browser history](#-time-machine-for-the-web-rag-your-entire-browser-history)**, **[chat history](#-wechat-detective-unlock-your-golden-memories)**, **[codebase](#-claude-code-integration-transform-your-development-workflow)**\*, or external knowledge bases (i.e., 60M documents) - all on your laptop, with zero cloud costs and complete privacy.
+
+
+\* Claude Code only supports basic `grep`-style keyword search. **LEANN** is a drop-in **semantic search MCP service fully compatible with Claude Code**, unlocking intelligent retrieval without changing your workflow.
@@ -26,7 +30,7 @@ LEANN achieves this through *graph-based selective recomputation* with *high-deg
 LEANN vs Traditional Vector DB Storage Comparison

-> **The numbers speak for themselves:** Index 60 million Wikipedia chunks in just 6GB instead of 201GB. From emails to browser history, everything fits on your laptop. [See detailed benchmarks for different applications below ↓](#storage-comparison)
+> **The numbers speak for themselves:** Index 60 million text chunks in just 6GB instead of 201GB. From emails to browser history, everything fits on your laptop. [See detailed benchmarks for different applications below ↓](#storage-comparison)
 🔒 **Privacy:** Your data never leaves your laptop. No OpenAI, no cloud, no "terms of service".
@@ -211,30 +215,6 @@ All RAG examples share these common parameters. **Interactive mode** is availabl
-### 🚀 Claude Code Integration: Transform Your Development Workflow!
-
-**The future of code assistance is here.** Transform your development workflow with LEANN's native MCP integration for Claude Code. Index your entire codebase and get intelligent code assistance directly in your IDE.
-
-
- MCP Integration
- Twitter
-
-
-**Key features:**
-- 🔍 **Semantic code search** across your entire project
-- 📚 **Context-aware assistance** for debugging and development
-- 🚀 **Zero-config setup** with automatic language detection
-- 🔒 **Complete privacy** - your code never leaves your machine
-
-```bash
-# Install LEANN globally for MCP integration
-uv tool install leann-core
-
-# Setup is automatic - just start using Claude Code!
-```
-
-**Ready to supercharge your coding?** [Complete Setup Guide →](packages/leann-mcp/README.md)
-
 ### 📄 Personal Data Manager: Process Any Documents (`.pdf`, `.txt`, `.md`)!
 Ask questions directly about your personal PDFs, documents, and any directory containing your files!
@@ -243,7 +223,7 @@ Ask questions directly about your personal PDFs, documents, and any directory co
 LEANN Document Search Demo

-The example below asks a question about summarizing our paper (uses default data in `data/`, which is a directory with diverse data sources: two papers, Pride and Prejudice, and a README in Chinese) and this is the **easiest example** to run here:
+The example below asks a question about summarizing our paper (uses default data in `data/`, which is a directory with diverse data sources: two papers, Pride and Prejudice, and a technical report in Chinese about LLMs at Huawei), and this is the **easiest example** to run here:
 ```bash
 source .venv/bin/activate  # Don't forget to activate the virtual environment
@@ -438,6 +418,26 @@ Once the index is built, you can ask questions like:
+### 🚀 Claude Code Integration: Transform Your Development Workflow!
+
+**The future of code assistance is here.** Transform your development workflow with LEANN's native MCP integration for Claude Code. Index your entire codebase and get intelligent code assistance directly in your IDE.
+
+**Key features:**
+- 🔍 **Semantic code search** across your entire project
+- 📚 **Context-aware assistance** for debugging and development
+- 🚀 **Zero-config setup** with automatic language detection
+
+```bash
+# Install LEANN globally for MCP integration
+uv tool install leann-core
+
+# Setup is automatic - just start using Claude Code!
+```
+Try our fully agentic pipeline with auto query rewriting, semantic search planning, and more:
+
+![LEANN MCP Integration](assets/mcp_leann.png)
+
+**Ready to supercharge your coding?** [Complete Setup Guide →](packages/leann-mcp/README.md)
 ## 🖥️ Command Line Interface
@@ -467,11 +467,8 @@ leann --help
 ### Usage Examples
 ```bash
-# Build an index from current directory (default)
-leann build my-docs
-
-# Or from specific directory
-leann build my-docs --docs ./documents
+# Build from a specific directory; my-docs is the index name
+leann build my-docs --docs ./your_documents
 # Search your documents
 leann search my-docs "machine learning concepts"
diff --git a/assets/mcp_leann.png b/assets/mcp_leann.png
new file mode 100644
index 0000000..de5ed04
Binary files /dev/null and b/assets/mcp_leann.png differ
diff --git a/packages/leann-backend-diskann/pyproject.toml b/packages/leann-backend-diskann/pyproject.toml
index b0a168d..5519ac2 100644
--- a/packages/leann-backend-diskann/pyproject.toml
+++ b/packages/leann-backend-diskann/pyproject.toml
@@ -4,8 +4,8 @@ build-backend = "scikit_build_core.build"
 [project]
 name = "leann-backend-diskann"
-version = "0.2.2"
-dependencies = ["leann-core==0.2.2", "numpy", "protobuf>=3.19.0"]
+version = "0.2.5"
+dependencies = ["leann-core==0.2.5", "numpy", "protobuf>=3.19.0"]
 [tool.scikit-build]
 # Key: simplified CMake path
diff --git a/packages/leann-backend-hnsw/pyproject.toml b/packages/leann-backend-hnsw/pyproject.toml
index 3518cd2..89e63eb 100644
--- a/packages/leann-backend-hnsw/pyproject.toml
+++ b/packages/leann-backend-hnsw/pyproject.toml
@@ -6,10 +6,10 @@ build-backend = "scikit_build_core.build"
 [project]
 name = "leann-backend-hnsw"
-version = "0.2.2"
+version = "0.2.5"
 description = "Custom-built HNSW (Faiss) backend for the Leann toolkit."
 dependencies = [
-    "leann-core==0.2.2",
+    "leann-core==0.2.5",
     "numpy",
     "pyzmq>=23.0.0",
     "msgpack>=1.0.0",
diff --git a/packages/leann-core/pyproject.toml b/packages/leann-core/pyproject.toml
index c8f59b0..7e564f4 100644
--- a/packages/leann-core/pyproject.toml
+++ b/packages/leann-core/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 [project]
 name = "leann-core"
-version = "0.2.2"
+version = "0.2.5"
 description = "Core API and plugin system for LEANN"
 readme = "README.md"
 requires-python = ">=3.9"
diff --git a/packages/leann-core/src/leann/chat.py b/packages/leann-core/src/leann/chat.py
index 541da07..665e1bd 100644
--- a/packages/leann-core/src/leann/chat.py
+++ b/packages/leann-core/src/leann/chat.py
@@ -17,12 +17,12 @@ logging.basicConfig(level=logging.INFO)
 logger = logging.getLogger(__name__)
-def check_ollama_models() -> list[str]:
+def check_ollama_models(host: str) -> list[str]:
     """Check available Ollama models and return a list"""
     try:
         import requests
-        response = requests.get("http://localhost:11434/api/tags", timeout=5)
+        response = requests.get(f"{host}/api/tags", timeout=5)
         if response.status_code == 200:
             data = response.json()
             return [model["name"] for model in data.get("models", [])]
@@ -309,10 +309,12 @@ def search_hf_models(query: str, limit: int = 10) -> list[str]:
     return search_hf_models_fuzzy(query, limit)
-def validate_model_and_suggest(model_name: str, llm_type: str) -> Optional[str]:
+def validate_model_and_suggest(
+    model_name: str, llm_type: str, host: str = "http://localhost:11434"
+) -> Optional[str]:
     """Validate model name and provide suggestions if invalid"""
     if llm_type == "ollama":
-        available_models = check_ollama_models()
+        available_models = check_ollama_models(host)
         if available_models and model_name not in available_models:
             error_msg = f"Model '{model_name}' not found in your local Ollama installation."
@@ -469,7 +471,7 @@ class OllamaChat(LLMInterface):
         requests.get(host)
         # Pre-check model availability with helpful suggestions
-        model_error = validate_model_and_suggest(model, "ollama")
+        model_error = validate_model_and_suggest(model, "ollama", host)
         if model_error:
             raise ValueError(model_error)
diff --git a/packages/leann-mcp/README.md b/packages/leann-mcp/README.md
index 0893ad3..d5fa99f 100644
--- a/packages/leann-mcp/README.md
+++ b/packages/leann-mcp/README.md
@@ -30,7 +30,7 @@ claude mcp add leann-server -- leann_mcp
 ```bash
 # Build an index for your project
-leann build my-project
+leann build my-project --docs ./  # change to your docs path
 # Start Claude Code
 claude
diff --git a/packages/leann/pyproject.toml b/packages/leann/pyproject.toml
index eeb4050..17b50d8 100644
--- a/packages/leann/pyproject.toml
+++ b/packages/leann/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 [project]
 name = "leann"
-version = "0.2.2"
+version = "0.2.5"
 description = "LEANN - The smallest vector index in the world. RAG Everything with LEANN!"
 readme = "README.md"
 requires-python = ">=3.9"
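
Reviewer note on the `chat.py` change: the hunks above thread `host` from `OllamaChat` through `validate_model_and_suggest` into `check_ollama_models`, so model validation queries the same server the chat client talks to instead of a hard-coded localhost. The following is a minimal sketch of that pattern, not the project's code. It swaps `requests` for the standard library's `urllib`, and `tags_url`, `_http_get`, `DEFAULT_OLLAMA_HOST`, the injectable `fetch` parameter, the `gpu-box` hostname, and the sample model name are all hypothetical, introduced only for illustration. It also strips a trailing slash from `host`, which the f-string in the diff does not do.

```python
import json
from typing import Callable, Optional
from urllib.request import urlopen

# Same default value the diff gives validate_model_and_suggest().
DEFAULT_OLLAMA_HOST = "http://localhost:11434"


def tags_url(host: str) -> str:
    """Build the model-list URL (hypothetical helper); tolerate a trailing slash."""
    return f"{host.rstrip('/')}/api/tags"


def _http_get(url: str) -> bytes:
    """Fetch a URL with the standard library (stand-in for requests.get)."""
    with urlopen(url, timeout=5) as response:
        return response.read()


def check_ollama_models(
    host: str = DEFAULT_OLLAMA_HOST,
    fetch: Optional[Callable[[str], bytes]] = None,
) -> list[str]:
    """Return the model names reported by the server at `host`, or [] on failure."""
    fetch = fetch or _http_get
    try:
        data = json.loads(fetch(tags_url(host)))
    except (OSError, ValueError):
        # Unreachable server or malformed JSON: report no models.
        return []
    return [model["name"] for model in data.get("models", [])]


if __name__ == "__main__":
    # Offline demonstration with a fake transport (illustrative data only).
    def fake_fetch(url: str) -> bytes:
        return json.dumps({"models": [{"name": "llama3:8b"}]}).encode()

    print(check_ollama_models("http://gpu-box:11434/", fetch=fake_fetch))
```

Making the transport injectable is a design choice of this sketch only: it lets the host plumbing be exercised without a running Ollama server.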