diff --git a/README.md b/README.md index 13bae5d..c2af034 100755 --- a/README.md +++ b/README.md @@ -6,6 +6,7 @@ Python 3.9+ MIT License Platform + MCP Integration

@@ -16,9 +17,10 @@ LEANN is an innovative vector database that democratizes personal AI. Transform LEANN achieves this through *graph-based selective recomputation* with *high-degree preserving pruning*, computing embeddings on-demand instead of storing them all. [Illustration Fig →](#️-architecture--how-it-works) | [Paper →](https://arxiv.org/abs/2506.08276) -**Ready to RAG Everything?** Transform your laptop into a personal AI assistant that can search your **[file system](#-personal-data-manager-process-any-documents-pdf-txt-md)**, **[emails](#-your-personal-email-secretary-rag-on-apple-mail)**, **[browser history](#-time-machine-for-the-web-rag-your-entire-browser-history)**, **[chat history](#-wechat-detective-unlock-your-golden-memories)**, or external knowledge bases (i.e., 60M documents) - all on your laptop, with zero cloud costs and complete privacy. +**Ready to RAG Everything?** Transform your laptop into a personal AI assistant that can semantically search your **[file system](#-personal-data-manager-process-any-documents-pdf-txt-md)**, **[emails](#-your-personal-email-secretary-rag-on-apple-mail)**, **[browser history](#-time-machine-for-the-web-rag-your-entire-browser-history)**, **[chat history](#-wechat-detective-unlock-your-golden-memories)**, **[codebase](#-claude-code-integration-transform-your-development-workflow)**\*, or external knowledge bases (e.g., 60M documents) - all on your laptop, with zero cloud costs and complete privacy. -> **🚀 Claude Code Integration!** LEANN now provides native MCP integration for Claude Code users. Index your codebase and get intelligent code assistance directly in Claude Code. [Setup Guide →](packages/leann-mcp/README.md) + +\* Claude Code only supports basic `grep`-style keyword search. **LEANN** is a drop-in **semantic search MCP service fully compatible with Claude Code**, unlocking intelligent retrieval without changing your workflow. 🔥 Check out [the easy setup →](packages/leann-mcp/README.md) @@ -28,7 +30,7 @@ LEANN achieves this through *graph-based selective recomputation* with *high-deg LEANN vs Traditional Vector DB Storage Comparison

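The storage comparison above (and the numbers quoted just below) can be sanity-checked with quick arithmetic; a sketch assuming 768-dimensional float32 vectors, e.g. the output size of `facebook/contriever`, with the remainder of the traditional ~201GB figure coming from index and metadata overhead:

```python
# Rough estimate: what storing every embedding for 60M chunks would cost.
# Assumed figures for illustration, not measured benchmarks.
num_chunks = 60_000_000
dim = 768            # e.g., facebook/contriever's embedding dimension
bytes_per_float = 4  # float32

embedding_bytes = num_chunks * dim * bytes_per_float
print(f"Stored embeddings alone: {embedding_bytes / 1e9:.0f} GB")  # ~184 GB

# LEANN instead keeps a pruned graph plus raw text and recomputes embeddings
# selectively at query time, which is how the index stays in the ~6 GB range.
```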
-> **The numbers speak for themselves:** Index 60 million Wikipedia chunks in just 6GB instead of 201GB. From emails to browser history, everything fits on your laptop. [See detailed benchmarks for different applications below ↓](#storage-comparison) +> **The numbers speak for themselves:** Index 60 million text chunks in just 6GB instead of 201GB. From emails to browser history, everything fits on your laptop. [See detailed benchmarks for different applications below ↓](#storage-comparison) 🔒 **Privacy:** Your data never leaves your laptop. No OpenAI, no cloud, no "terms of service". @@ -95,7 +97,6 @@ uv sync - ## Quick Start Our declarative API makes RAG as easy as writing a config file. @@ -187,8 +188,8 @@ All RAG examples share these common parameters. **Interactive mode** is availabl --force-rebuild # Force rebuild index even if it exists # Embedding Parameters ---embedding-model MODEL # e.g., facebook/contriever, text-embedding-3-small or mlx-community/multilingual-e5-base-mlx ---embedding-mode MODE # sentence-transformers, openai, or mlx +--embedding-model MODEL # e.g., facebook/contriever, text-embedding-3-small, nomic-embed-text, or mlx-community/multilingual-e5-base-mlx +--embedding-mode MODE # sentence-transformers, openai, mlx, or ollama # LLM Parameters (Text generation models) --llm TYPE # LLM backend: openai, ollama, or hf (default: openai) @@ -221,7 +222,7 @@ Ask questions directly about your personal PDFs, documents, and any directory co LEANN Document Search Demo

-The example below asks a question about summarizing our paper (uses default data in `data/`, which is a directory with diverse data sources: two papers, Pride and Prejudice, and a README in Chinese) and this is the **easiest example** to run here: +The example below asks a question about summarizing our paper (uses default data in `data/`, which is a directory with diverse data sources: two papers, Pride and Prejudice, and a Chinese-language technical report about LLMs at Huawei), and this is the **easiest example** to run here: ```bash source .venv/bin/activate # Don't forget to activate the virtual environment @@ -416,7 +417,26 @@ Once the index is built, you can ask questions like: +### 🚀 Claude Code Integration: Transform Your Development Workflow! +**The future of code assistance is here.** LEANN's native MCP integration brings semantic retrieval straight into Claude Code. Index your entire codebase and get intelligent code assistance directly in your IDE. + +**Key features:** +- 🔍 **Semantic code search** across your entire project +- 📚 **Context-aware assistance** for debugging and development +- 🚀 **Zero-config setup** with automatic language detection + +```bash +# Install LEANN globally for MCP integration +uv tool install leann-core + +# Setup is automatic - just start using Claude Code! +``` +Try our fully agentic pipeline with auto query rewriting, semantic search planning, and more: + +![LEANN MCP Integration](assets/mcp_leann.png) + +**Ready to supercharge your coding?** [Complete Setup Guide →](packages/leann-mcp/README.md) ## 🖥️ Command Line Interface @@ -446,11 +466,8 @@ leann --help ### Usage Examples ```bash -# Build an index from current directory (default) -leann build my-docs - -# Or from specific directory -leann build my-docs --docs ./documents +# Build from a specific directory; my-docs is the index name +leann build my-docs --docs ./your_documents # Search your documents leann search my-docs "machine learning concepts" diff --git a/apps/base_rag_example.py b/apps/base_rag_example.py index f5a481c..4bd62b9 100644 --- a/apps/base_rag_example.py +++ b/apps/base_rag_example.py @@ -75,7 +75,7 @@ class BaseRAGExample(ABC): "--embedding-mode", type=str, default="sentence-transformers", - choices=["sentence-transformers", "openai", "mlx"], + choices=["sentence-transformers", "openai", "mlx", "ollama"], help="Embedding backend mode (default: sentence-transformers)", ) @@ -85,7 +85,7 @@ class BaseRAGExample(ABC): "--llm", type=str, default="openai", - choices=["openai", "ollama", "hf"], + choices=["openai", "ollama", "hf", "simulated"], help="LLM backend to use (default: openai)", ) llm_group.add_argument( diff --git a/assets/mcp_leann.png b/assets/mcp_leann.png new file mode 100644 index 0000000..de5ed04 Binary files /dev/null and b/assets/mcp_leann.png differ diff --git a/docs/claude-code-integration.md b/docs/claude-code-integration.md deleted file mode 100644 index e19adfb..0000000 --- a/docs/claude-code-integration.md +++ /dev/null @@ -1,150 +0,0 @@ -# Claude Code x LEANN Integration Guide - -## ✅ Status: It Already Works! - -Good news: the LEANN CLI already works in Claude Code as-is, with no modifications needed! - -## 🚀 Get Started Now - -### 1. Activate the environment -```bash -# From the LEANN project directory -source .venv/bin/activate.fish # fish shell -# or -source .venv/bin/activate # bash shell -``` - -### 2. Basic commands - -#### List existing indexes -```bash -leann list -``` - -#### Search documents -```bash -leann search my-docs "machine learning" --recompute-embeddings -``` - -#### Question answering -```bash -echo "What is machine learning?" | leann ask my-docs --llm ollama --model qwen3:8b --recompute-embeddings -``` - -#### Build a new index -```bash -leann build project-docs --docs ./src --recompute-embeddings -``` - -## 💡 Claude Code Usage Tips - -### Use directly inside Claude Code - -1. **Activate the environment**: - ```bash - cd /Users/andyl/Projects/LEANN-RAG - source .venv/bin/activate.fish - ``` - -2. **Search the codebase**: - ```bash - leann search my-docs "authentication patterns" --recompute-embeddings --top-k 10 - ``` - -3. **Intelligent Q&A**: - ```bash - echo "How does the authentication system work?" | leann ask my-docs --llm ollama --model qwen3:8b --recompute-embeddings - ``` - -### Batch operation examples - -```bash -# Build the project documentation index -leann build project-docs --docs ./docs --force - -# Search several keywords -leann search project-docs "API authentication" --recompute-embeddings -leann search project-docs "database schema" --recompute-embeddings -leann search project-docs "deployment guide" --recompute-embeddings - -# Q&A mode -echo "What are the API endpoints?" | leann ask project-docs --recompute-embeddings -``` - -## 🎯 Workflows Claude Can Run Right Away - -### Code analysis workflow -```bash -# 1. Build the codebase index -leann build codebase --docs ./src --backend hnsw --recompute-embeddings - -# 2. Analyze the architecture -echo "What is the overall architecture?" | leann ask codebase --recompute-embeddings - -# 3. Find specific functionality -leann search codebase "user authentication" --recompute-embeddings --top-k 5 - -# 4. Understand implementation details -echo "How is user authentication implemented?" | leann ask codebase --recompute-embeddings -``` - -### Document understanding workflow -```bash -# 1. Index the project documentation -leann build docs --docs ./docs --recompute-embeddings - -# 2. Look up information quickly -leann search docs "installation requirements" --recompute-embeddings - -# 3. Get detailed explanations -echo "What are the system requirements?" | leann ask docs --recompute-embeddings -``` - -## ⚠️ Important Notes - -1. **You must pass `--recompute-embeddings`** - this is the key flag; omitting it raises an error -2. **Activate the virtual environment first** - make sure LEANN's Python environment is available -3. **Ollama must be installed in advance** - the ask feature needs a local LLM - -## 🔥 Ready-to-Use Claude Prompt - -``` -Help me analyze this codebase using LEANN: - -1. First, activate the environment: - cd /Users/andyl/Projects/LEANN-RAG && source .venv/bin/activate.fish - -2. Build an index of the source code: - leann build codebase --docs ./src --recompute-embeddings - -3. Search for authentication patterns: - leann search codebase "authentication middleware" --recompute-embeddings --top-k 10 - -4. Ask about the authentication system: - echo "How does user authentication work in this codebase?" | leann ask codebase --recompute-embeddings - -Please execute these commands and help me understand the code structure. -``` - -## 📈 Planned Improvements - -It is already usable today, but a few things could be optimized further: - -1. **Simpler commands** - enable recompute-embeddings by default -2. **Config file** - avoid retyping the same parameters -3. **State management** - auto-detect environment and indexes -4. **Output format** - output that is easier for Claude to parse - -These are all icing on the cake, though - you can use it right now! - -## 🎉 Summary - -**LEANN already works perfectly in Claude Code!** - -- ✅ Search works -- ✅ RAG question answering works -- ✅ Index building works -- ✅ Multiple data sources supported -- ✅ Local LLMs supported - -Just remember to add the `--recompute-embeddings` flag! diff --git a/docs/configuration-guide.md b/docs/configuration-guide.md index 7c6d663..28aa202 100644 --- a/docs/configuration-guide.md +++ b/docs/configuration-guide.md @@ -49,14 +49,25 @@ Based on our experience developing LEANN, embedding models fall into three categori - **Cons**: Slower inference, longer index build times - **Use when**: Quality is paramount and you have sufficient compute resources.
**Highly recommended** for production use -### Quick Start: OpenAI Embeddings (Fastest Setup) +### Quick Start: Cloud and Local Embedding Options +**OpenAI Embeddings (Fastest Setup)** For immediate testing without local model downloads: ```bash # Set OpenAI embeddings (requires OPENAI_API_KEY) --embedding-mode openai --embedding-model text-embedding-3-small ``` +**Ollama Embeddings (Privacy-Focused)** +For local embeddings with complete privacy: +```bash +# First, pull an embedding model +ollama pull nomic-embed-text + +# Use Ollama embeddings +--embedding-mode ollama --embedding-model nomic-embed-text +``` +
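For context, the Ollama mode introduced in this diff (see `embedding_compute.py` below) boils down to one POST per chunk against Ollama's `/api/embeddings` endpoint; a minimal sketch, assuming a local Ollama with `nomic-embed-text` already pulled:

```python
import numpy as np
import requests

def ollama_embed(text: str, model: str = "nomic-embed-text",
                 host: str = "http://localhost:11434") -> np.ndarray:
    """Fetch one embedding via Ollama's REST API, L2-normalized as LEANN does."""
    resp = requests.post(f"{host}/api/embeddings",
                         json={"model": model, "prompt": text}, timeout=30)
    resp.raise_for_status()
    vec = np.array(resp.json()["embedding"], dtype=np.float32)
    return vec / (np.linalg.norm(vec) + 1e-8)  # epsilon avoids division by zero

print(ollama_embed("hello world").shape)  # (768,) for nomic-embed-text
```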
Cloud vs Local Trade-offs diff --git a/packages/leann-backend-diskann/leann_backend_diskann/diskann_embedding_server.py b/packages/leann-backend-diskann/leann_backend_diskann/diskann_embedding_server.py index b566ae6..456689d 100644 --- a/packages/leann-backend-diskann/leann_backend_diskann/diskann_embedding_server.py +++ b/packages/leann-backend-diskann/leann_backend_diskann/diskann_embedding_server.py @@ -263,7 +263,7 @@ if __name__ == "__main__": "--embedding-mode", type=str, default="sentence-transformers", - choices=["sentence-transformers", "openai", "mlx"], + choices=["sentence-transformers", "openai", "mlx", "ollama"], help="Embedding backend mode", ) parser.add_argument( diff --git a/packages/leann-backend-diskann/pyproject.toml b/packages/leann-backend-diskann/pyproject.toml index 48b2134..5519ac2 100644 --- a/packages/leann-backend-diskann/pyproject.toml +++ b/packages/leann-backend-diskann/pyproject.toml @@ -4,8 +4,8 @@ build-backend = "scikit_build_core.build" [project] name = "leann-backend-diskann" -version = "0.2.1" -dependencies = ["leann-core==0.2.1", "numpy", "protobuf>=3.19.0"] +version = "0.2.5" +dependencies = ["leann-core==0.2.5", "numpy", "protobuf>=3.19.0"] [tool.scikit-build] # Key: simplified CMake path diff --git a/packages/leann-backend-hnsw/leann_backend_hnsw/hnsw_embedding_server.py b/packages/leann-backend-hnsw/leann_backend_hnsw/hnsw_embedding_server.py index bf36883..f26a050 100644 --- a/packages/leann-backend-hnsw/leann_backend_hnsw/hnsw_embedding_server.py +++ b/packages/leann-backend-hnsw/leann_backend_hnsw/hnsw_embedding_server.py @@ -285,7 +285,7 @@ if __name__ == "__main__": "--embedding-mode", type=str, default="sentence-transformers", - choices=["sentence-transformers", "openai", "mlx"], + choices=["sentence-transformers", "openai", "mlx", "ollama"], help="Embedding backend mode", ) diff --git a/packages/leann-backend-hnsw/pyproject.toml b/packages/leann-backend-hnsw/pyproject.toml index f2b4b5c..89e63eb 100644 --- a/packages/leann-backend-hnsw/pyproject.toml +++ b/packages/leann-backend-hnsw/pyproject.toml @@ -6,10 +6,10 @@ build-backend = "scikit_build_core.build" [project] name = "leann-backend-hnsw" -version = "0.2.1" +version = "0.2.5" description = "Custom-built HNSW (Faiss) backend for the Leann toolkit." 
dependencies = [ - "leann-core==0.2.1", + "leann-core==0.2.5", "numpy", "pyzmq>=23.0.0", "msgpack>=1.0.0", diff --git a/packages/leann-core/pyproject.toml b/packages/leann-core/pyproject.toml index e7d178d..7e564f4 100644 --- a/packages/leann-core/pyproject.toml +++ b/packages/leann-core/pyproject.toml @@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta" [project] name = "leann-core" -version = "0.2.1" +version = "0.2.5" description = "Core API and plugin system for LEANN" readme = "README.md" requires-python = ">=3.9" diff --git a/packages/leann-core/src/leann/chat.py b/packages/leann-core/src/leann/chat.py index 541da07..4200e8e 100644 --- a/packages/leann-core/src/leann/chat.py +++ b/packages/leann-core/src/leann/chat.py @@ -17,12 +17,12 @@ logging.basicConfig(level=logging.INFO) logger = logging.getLogger(__name__) -def check_ollama_models() -> list[str]: +def check_ollama_models(host: str) -> list[str]: """Check available Ollama models and return a list""" try: import requests - response = requests.get("http://localhost:11434/api/tags", timeout=5) + response = requests.get(f"{host}/api/tags", timeout=5) if response.status_code == 200: data = response.json() return [model["name"] for model in data.get("models", [])] @@ -309,10 +309,12 @@ def search_hf_models(query: str, limit: int = 10) -> list[str]: return search_hf_models_fuzzy(query, limit) -def validate_model_and_suggest(model_name: str, llm_type: str) -> Optional[str]: +def validate_model_and_suggest( + model_name: str, llm_type: str, host: str = "http://localhost:11434" +) -> str | None: """Validate model name and provide suggestions if invalid""" if llm_type == "ollama": - available_models = check_ollama_models() + available_models = check_ollama_models(host) if available_models and model_name not in available_models: error_msg = f"Model '{model_name}' not found in your local Ollama installation." 
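The host-aware rewrite above matters for anyone running Ollama somewhere other than localhost; stripped of the surrounding suggestion logic, the check is a single GET against the tags endpoint (the remote address below is a hypothetical example):

```python
import requests

def list_ollama_models(host: str = "http://localhost:11434") -> list[str]:
    """Return the names of installed Ollama models (empty list on failure)."""
    try:
        resp = requests.get(f"{host}/api/tags", timeout=5)
        if resp.status_code == 200:
            return [m["name"] for m in resp.json().get("models", [])]
    except requests.exceptions.RequestException:
        pass
    return []

# Works against a remote instance too, e.g.:
# list_ollama_models("http://192.168.1.20:11434")
```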
@@ -469,7 +471,7 @@ class OllamaChat(LLMInterface): requests.get(host) # Pre-check model availability with helpful suggestions - model_error = validate_model_and_suggest(model, "ollama") + model_error = validate_model_and_suggest(model, "ollama", host) if model_error: raise ValueError(model_error) diff --git a/packages/leann-core/src/leann/cli.py b/packages/leann-core/src/leann/cli.py index 489c5d1..f307204 100644 --- a/packages/leann-core/src/leann/cli.py +++ b/packages/leann-core/src/leann/cli.py @@ -74,10 +74,11 @@ class LeannCLI: formatter_class=argparse.RawDescriptionHelpFormatter, epilog=""" Examples: - leann build my-docs --docs ./documents # Build index named my-docs - leann search my-docs "query" # Search in my-docs index - leann ask my-docs "question" # Ask my-docs index - leann list # List all stored indexes + leann build my-docs --docs ./documents # Build index named my-docs + leann build my-ppts --docs ./ --file-types .pptx,.pdf # Index only PowerPoint and PDF files + leann search my-docs "query" # Search in my-docs index + leann ask my-docs "question" # Ask my-docs index + leann list # List all stored indexes """, ) @@ -93,12 +94,24 @@ Examples: "--backend", type=str, default="hnsw", choices=["hnsw", "diskann"] ) build_parser.add_argument("--embedding-model", type=str, default="facebook/contriever") + build_parser.add_argument( + "--embedding-mode", + type=str, + default="sentence-transformers", + choices=["sentence-transformers", "openai", "mlx", "ollama"], + help="Embedding backend mode (default: sentence-transformers)", + ) build_parser.add_argument("--force", "-f", action="store_true", help="Force rebuild") build_parser.add_argument("--graph-degree", type=int, default=32) build_parser.add_argument("--complexity", type=int, default=64) build_parser.add_argument("--num-threads", type=int, default=1) build_parser.add_argument("--compact", action="store_true", default=True) build_parser.add_argument("--recompute", action="store_true", default=True) + build_parser.add_argument( + "--file-types", + type=str, + help="Comma-separated list of file extensions to include (e.g., '.txt,.pdf,.pptx'). 
If not specified, uses default supported types.", + ) # Search command search_parser = subparsers.add_parser("search", help="Search documents") @@ -108,7 +121,12 @@ Examples: search_parser.add_argument("--complexity", type=int, default=64) search_parser.add_argument("--beam-width", type=int, default=1) search_parser.add_argument("--prune-ratio", type=float, default=0.0) - search_parser.add_argument("--recompute-embeddings", action="store_true") + search_parser.add_argument( + "--recompute-embeddings", + action="store_true", + default=True, + help="Recompute embeddings (default: True)", + ) search_parser.add_argument( "--pruning-strategy", choices=["global", "local", "proportional"], @@ -131,7 +149,12 @@ Examples: ask_parser.add_argument("--complexity", type=int, default=32) ask_parser.add_argument("--beam-width", type=int, default=1) ask_parser.add_argument("--prune-ratio", type=float, default=0.0) - ask_parser.add_argument("--recompute-embeddings", action="store_true") + ask_parser.add_argument( + "--recompute-embeddings", + action="store_true", + default=True, + help="Recompute embeddings (default: True)", + ) ask_parser.add_argument( "--pruning-strategy", choices=["global", "local", "proportional"], @@ -254,8 +277,10 @@ Examples: print(f' leann search {example_name} "your query"') print(f" leann ask {example_name} --interactive") - def load_documents(self, docs_dir: str): + def load_documents(self, docs_dir: str, custom_file_types: str | None = None): print(f"Loading documents from {docs_dir}...") + if custom_file_types: + print(f"Using custom file types: {custom_file_types}") # Try to use better PDF parsers first documents = [] @@ -287,66 +312,81 @@ Examples: documents.extend(default_docs) # Load other file types with default reader - code_extensions = [ - # Original document types - ".txt", - ".md", - ".docx", - # Code files for Claude Code integration - ".py", - ".js", - ".ts", - ".jsx", - ".tsx", - ".java", - ".cpp", - ".c", - ".h", - ".hpp", - ".cs", - ".go", - ".rs", - ".rb", - ".php", - ".swift", - ".kt", - ".scala", - ".r", - ".sql", - ".sh", - ".bash", - ".zsh", - ".fish", - ".ps1", - ".bat", - # Config and markup files - ".json", - ".yaml", - ".yml", - ".xml", - ".toml", - ".ini", - ".cfg", - ".conf", - ".html", - ".css", - ".scss", - ".less", - ".vue", - ".svelte", - # Data science - ".ipynb", - ".R", - ".py", - ".jl", - ] - other_docs = SimpleDirectoryReader( - docs_dir, - recursive=True, - encoding="utf-8", - required_exts=code_extensions, - ).load_data(show_progress=True) - documents.extend(other_docs) + if custom_file_types: + # Parse custom file types from comma-separated string + code_extensions = [ext.strip() for ext in custom_file_types.split(",") if ext.strip()] + # Ensure extensions start with a dot + code_extensions = [ext if ext.startswith(".") else f".{ext}" for ext in code_extensions] + else: + # Use default supported file types + code_extensions = [ + # Original document types + ".txt", + ".md", + ".docx", + ".pptx", + # Code files for Claude Code integration + ".py", + ".js", + ".ts", + ".jsx", + ".tsx", + ".java", + ".cpp", + ".c", + ".h", + ".hpp", + ".cs", + ".go", + ".rs", + ".rb", + ".php", + ".swift", + ".kt", + ".scala", + ".r", + ".sql", + ".sh", + ".bash", + ".zsh", + ".fish", + ".ps1", + ".bat", + # Config and markup files + ".json", + ".yaml", + ".yml", + ".xml", + ".toml", + ".ini", + ".cfg", + ".conf", + ".html", + ".css", + ".scss", + ".less", + ".vue", + ".svelte", + # Data science + ".ipynb", + ".R", + ".py", + ".jl", + ] + # Try to load other 
file types, but don't fail if none are found + try: + other_docs = SimpleDirectoryReader( + docs_dir, + recursive=True, + encoding="utf-8", + required_exts=code_extensions, + ).load_data(show_progress=True) + documents.extend(other_docs) + except ValueError as e: + if "No files found" in str(e): + print("No additional files found for other supported types.") + else: + raise e all_texts = [] @@ -424,7 +464,7 @@ Examples: print(f"Index '{index_name}' already exists. Use --force to rebuild.") return - all_texts = self.load_documents(docs_dir) + all_texts = self.load_documents(docs_dir, args.file_types) if not all_texts: print("No documents found") return @@ -436,6 +476,7 @@ Examples: builder = LeannBuilder( backend_name=args.backend, embedding_model=args.embedding_model, + embedding_mode=args.embedding_mode, graph_degree=args.graph_degree, complexity=args.complexity, is_compact=args.compact, diff --git a/packages/leann-core/src/leann/embedding_compute.py b/packages/leann-core/src/leann/embedding_compute.py index 95fa9e4..67f33d1 100644 --- a/packages/leann-core/src/leann/embedding_compute.py +++ b/packages/leann-core/src/leann/embedding_compute.py @@ -6,6 +6,7 @@ Preserves all optimization parameters to ensure performance import logging import os +from concurrent.futures import ThreadPoolExecutor, as_completed from typing import Any import numpy as np @@ -35,7 +36,7 @@ def compute_embeddings( Args: texts: List of texts to compute embeddings for model_name: Model name - mode: Computation mode ('sentence-transformers', 'openai', 'mlx') + mode: Computation mode ('sentence-transformers', 'openai', 'mlx', 'ollama') is_build: Whether this is a build operation (shows progress bar) batch_size: Batch size for processing adaptive_optimization: Whether to use adaptive optimization based on batch size @@ -55,6 +56,8 @@ def compute_embeddings( return compute_embeddings_openai(texts, model_name) elif mode == "mlx": return compute_embeddings_mlx(texts, model_name) + elif mode == "ollama": + return compute_embeddings_ollama(texts, model_name, is_build=is_build) else: raise ValueError(f"Unsupported embedding mode: {mode}") @@ -365,3 +368,262 @@ def compute_embeddings_mlx(chunks: list[str], model_name: str, batch_size: int = # Stack numpy arrays return np.stack(all_embeddings) + + +def compute_embeddings_ollama( + texts: list[str], model_name: str, is_build: bool = False, host: str = "http://localhost:11434" +) -> np.ndarray: + """ + Compute embeddings using Ollama API. + + Args: + texts: List of texts to compute embeddings for + model_name: Ollama model name (e.g., "nomic-embed-text", "mxbai-embed-large") + is_build: Whether this is a build operation (shows progress bar) + host: Ollama host URL (default: http://localhost:11434) + + Returns: + Normalized embeddings array, shape: (len(texts), embedding_dim) + """ + try: + import requests + except ImportError: + raise ImportError( + "The 'requests' library is required for Ollama embeddings. 
Install with: uv pip install requests" + ) + + if not texts: + raise ValueError("Cannot compute embeddings for empty text list") + + logger.info( + f"Computing embeddings for {len(texts)} texts using Ollama API, model: '{model_name}'" + ) + + # Check if Ollama is running + try: + response = requests.get(f"{host}/api/version", timeout=5) + response.raise_for_status() + except requests.exceptions.ConnectionError: + error_msg = ( + f"❌ Could not connect to Ollama at {host}.\n\n" + "Please ensure Ollama is running:\n" + " • macOS/Linux: ollama serve\n" + " • Windows: Make sure Ollama is running in the system tray\n\n" + "Installation: https://ollama.com/download" + ) + raise RuntimeError(error_msg) + except Exception as e: + raise RuntimeError(f"Unexpected error connecting to Ollama: {e}") + + # Check if model exists and provide helpful suggestions + try: + response = requests.get(f"{host}/api/tags", timeout=5) + response.raise_for_status() + models = response.json() + model_names = [model["name"] for model in models.get("models", [])] + + # Filter for embedding models (models that support embeddings) + embedding_models = [] + suggested_embedding_models = [ + "nomic-embed-text", + "mxbai-embed-large", + "bge-m3", + "all-minilm", + "snowflake-arctic-embed", + ] + + for model in model_names: + # Check if it's an embedding model (by name patterns or known models) + base_name = model.split(":")[0] + if any(emb in base_name for emb in ["embed", "bge", "minilm", "e5"]): + embedding_models.append(model) + + # Check if model exists (handle versioned names) + model_found = any( + model_name == name.split(":")[0] or model_name == name for name in model_names + ) + + if not model_found: + error_msg = f"❌ Model '{model_name}' not found in local Ollama.\n\n" + + # Suggest pulling the model + error_msg += "📦 To install this embedding model:\n" + error_msg += f" ollama pull {model_name}\n\n" + + # Show available embedding models + if embedding_models: + error_msg += "✅ Available embedding models:\n" + for model in embedding_models[:5]: + error_msg += f" • {model}\n" + if len(embedding_models) > 5: + error_msg += f" ... 
and {len(embedding_models) - 5} more\n" + else: + error_msg += "💡 Popular embedding models to install:\n" + for model in suggested_embedding_models[:3]: + error_msg += f" • ollama pull {model}\n" + + error_msg += "\n📚 Browse more: https://ollama.com/library" + raise ValueError(error_msg) + + # Verify the model supports embeddings by testing it + try: + test_response = requests.post( + f"{host}/api/embeddings", json={"model": model_name, "prompt": "test"}, timeout=10 + ) + if test_response.status_code != 200: + error_msg = ( + f"⚠️ Model '{model_name}' exists but may not support embeddings.\n\n" + f"Please use an embedding model like:\n" + ) + for model in suggested_embedding_models[:3]: + error_msg += f" • {model}\n" + raise ValueError(error_msg) + except requests.exceptions.RequestException: + # If test fails, continue anyway - model might still work + pass + + except requests.exceptions.RequestException as e: + logger.warning(f"Could not verify model existence: {e}") + + # Process embeddings with optimized concurrent processing + import requests + + def get_single_embedding(text_idx_tuple): + """Helper function to get embedding for a single text.""" + text, idx = text_idx_tuple + max_retries = 3 + retry_count = 0 + + # Truncate very long texts to avoid API issues + truncated_text = text[:8000] if len(text) > 8000 else text + + while retry_count < max_retries: + try: + response = requests.post( + f"{host}/api/embeddings", + json={"model": model_name, "prompt": truncated_text}, + timeout=30, + ) + response.raise_for_status() + + result = response.json() + embedding = result.get("embedding") + + if embedding is None: + raise ValueError(f"No embedding returned for text {idx}") + + return idx, embedding + + except requests.exceptions.Timeout: + retry_count += 1 + if retry_count >= max_retries: + logger.warning(f"Timeout for text {idx} after {max_retries} retries") + return idx, None + + except Exception as e: + if retry_count >= max_retries - 1: + logger.error(f"Failed to get embedding for text {idx}: {e}") + return idx, None + retry_count += 1 + + return idx, None + + # Determine if we should use concurrent processing + use_concurrent = ( + len(texts) > 5 and not is_build + ) # Don't use concurrent in build mode to avoid overwhelming + max_workers = min(4, len(texts)) # Limit concurrent requests to avoid overwhelming Ollama + + all_embeddings = [None] * len(texts) # Pre-allocate list to maintain order + failed_indices = [] + + if use_concurrent: + logger.info( + f"Using concurrent processing with {max_workers} workers for {len(texts)} texts" + ) + + with ThreadPoolExecutor(max_workers=max_workers) as executor: + # Submit all tasks + future_to_idx = { + executor.submit(get_single_embedding, (text, idx)): idx + for idx, text in enumerate(texts) + } + + # Add progress bar for concurrent processing + try: + if is_build or len(texts) > 10: + from tqdm import tqdm + + futures_iterator = tqdm( + as_completed(future_to_idx), + total=len(texts), + desc="Computing Ollama embeddings", + ) + else: + futures_iterator = as_completed(future_to_idx) + except ImportError: + futures_iterator = as_completed(future_to_idx) + + # Collect results as they complete + for future in futures_iterator: + try: + idx, embedding = future.result() + if embedding is not None: + all_embeddings[idx] = embedding + else: + failed_indices.append(idx) + except Exception as e: + idx = future_to_idx[future] + logger.error(f"Exception for text {idx}: {e}") + failed_indices.append(idx) + + else: + # Sequential processing with 
progress bar + show_progress = is_build or len(texts) > 10 + + try: + if show_progress: + from tqdm import tqdm + + iterator = tqdm( + enumerate(texts), total=len(texts), desc="Computing Ollama embeddings" + ) + else: + iterator = enumerate(texts) + except ImportError: + iterator = enumerate(texts) + + for idx, text in iterator: + result_idx, embedding = get_single_embedding((text, idx)) + if embedding is not None: + all_embeddings[idx] = embedding + else: + failed_indices.append(idx) + + # Handle failed embeddings + if failed_indices: + if len(failed_indices) == len(texts): + raise RuntimeError("Failed to compute any embeddings") + + logger.warning(f"Failed to compute embeddings for {len(failed_indices)}/{len(texts)} texts") + + # Use zero embeddings as fallback for failed ones + valid_embedding = next((e for e in all_embeddings if e is not None), None) + if valid_embedding: + embedding_dim = len(valid_embedding) + for idx in failed_indices: + all_embeddings[idx] = [0.0] * embedding_dim + + # Remove None values and convert to numpy array + all_embeddings = [e for e in all_embeddings if e is not None] + + # Convert to numpy array and normalize + embeddings = np.array(all_embeddings, dtype=np.float32) + + # Normalize embeddings (L2 normalization) + norms = np.linalg.norm(embeddings, axis=1, keepdims=True) + embeddings = embeddings / (norms + 1e-8) # Add small epsilon to avoid division by zero + + logger.info(f"Generated {len(embeddings)} embeddings, dimension: {embeddings.shape[1]}") + + return embeddings diff --git a/packages/leann-core/src/leann/mcp.py b/packages/leann-core/src/leann/mcp.py index 6de6750..f5a2cae 100755 --- a/packages/leann-core/src/leann/mcp.py +++ b/packages/leann-core/src/leann/mcp.py @@ -1,7 +1,6 @@ #!/usr/bin/env python3 import json -import os import subprocess import sys @@ -62,10 +61,6 @@ def handle_request(request): tool_name = request["params"]["name"] args = request["params"].get("arguments", {}) - # Set working directory and environment - env = os.environ.copy() - cwd = "/Users/andyl/Projects/LEANN-RAG" - try: if tool_name == "leann_search": cmd = [ @@ -76,18 +71,14 @@ def handle_request(request): "--recompute-embeddings", f"--top-k={args.get('top_k', 5)}", ] - result = subprocess.run(cmd, capture_output=True, text=True, cwd=cwd, env=env) + result = subprocess.run(cmd, capture_output=True, text=True) elif tool_name == "leann_ask": cmd = f'echo "{args["question"]}" | leann ask {args["index_name"]} --recompute-embeddings --llm ollama --model qwen3:8b' - result = subprocess.run( - cmd, shell=True, capture_output=True, text=True, cwd=cwd, env=env - ) + result = subprocess.run(cmd, shell=True, capture_output=True, text=True) elif tool_name == "leann_list": - result = subprocess.run( - ["leann", "list"], capture_output=True, text=True, cwd=cwd, env=env - ) + result = subprocess.run(["leann", "list"], capture_output=True, text=True) return { "jsonrpc": "2.0", diff --git a/packages/leann-mcp/README.md b/packages/leann-mcp/README.md index 0893ad3..b762ae9 100644 --- a/packages/leann-mcp/README.md +++ b/packages/leann-mcp/README.md @@ -1,18 +1,25 @@ -# LEANN Claude Code Integration +# 🔥 LEANN Claude Code Integration -Intelligent code assistance using LEANN's vector search directly in Claude Code. +Transform your development workflow with intelligent code assistance using LEANN's semantic search directly in Claude Code. 
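Conceptually the server is thin: each MCP tool call is translated into a plain `leann` CLI invocation, mirroring the `handle_request` logic from `leann/mcp.py` above. A simplified sketch of that round trip (the request payload and response shape here are illustrative, following the MCP JSON-RPC convention rather than the exact implementation):

```python
import json
import subprocess

# An illustrative MCP "tools/call" request as leann_mcp would receive it.
request = {
    "jsonrpc": "2.0", "id": 1, "method": "tools/call",
    "params": {
        "name": "leann_search",
        "arguments": {"index_name": "my-project", "query": "auth flow", "top_k": 5},
    },
}

args = request["params"]["arguments"]
cmd = ["leann", "search", args["index_name"], args["query"],
       "--recompute-embeddings", f"--top-k={args.get('top_k', 5)}"]
result = subprocess.run(cmd, capture_output=True, text=True)

# Wrap the CLI output in a JSON-RPC response for Claude Code.
print(json.dumps({"jsonrpc": "2.0", "id": request["id"],
                  "result": {"content": [{"type": "text", "text": result.stdout}]}}))
```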
## Prerequisites -First, install LEANN CLI globally: +**Step 1:** Complete the basic LEANN installation following the [📦 Installation guide](../../README.md#installation) in the root README: +```bash +uv venv +source .venv/bin/activate +uv pip install leann +``` + +**Step 2:** Install LEANN globally for MCP integration: ```bash uv tool install leann-core ``` This makes the `leann` command available system-wide, which `leann_mcp` requires. -## Quick Setup +## 🚀 Quick Setup Add the LEANN MCP server to Claude Code: @@ -20,23 +27,25 @@ claude mcp add leann-server -- leann_mcp ``` -## Available Tools +## 🛠️ Available Tools -- **`leann_list`** - List available indexes across all projects -- **`leann_search`** - Search code and documents with semantic queries -- **`leann_ask`** - Ask questions and get AI-powered answers from your codebase +Once connected, you'll have access to these powerful semantic search tools in Claude Code: -## Quick Start +- **`leann_list`** - List all available indexes across your projects -- no wait +- **`leann_search`** - Perform semantic searches across code and documents +- **`leann_ask`** - Ask natural language questions and get AI-powered answers from your codebase + +## 🎯 Quick Start Example ```bash -# Build an index for your project -leann build my-project +# Build an index for your project (adjust the path to your project) +leann build my-project --docs ./ # Start Claude Code claude ``` -Then in Claude Code: +**Try this in Claude Code:** ``` Help me understand this codebase. List available indexes and search for authentication patterns. ``` @@ -46,24 +55,37 @@ Help me understand this codebase. List available indexes and search for authenti

-## How It Works +## 🧠 How It Works -- **`leann`** - Core CLI tool for indexing and searching (installed globally) +The integration consists of three key components working seamlessly together: + +- **`leann`** - Core CLI tool for indexing and searching (installed globally via `uv tool install`) - **`leann_mcp`** - MCP server that wraps `leann` commands for Claude Code integration -- Claude Code calls `leann_mcp`, which executes `leann` commands and returns results +- **Claude Code** - Calls `leann_mcp`, which executes `leann` commands and returns intelligent results -## File Support +## 📁 File Support -Python, JavaScript, TypeScript, Java, Go, Rust, SQL, YAML, JSON, and 30+ more file types. +LEANN understands **30+ file types** including: +- **Programming**: Python, JavaScript, TypeScript, Java, Go, Rust, C++, C# +- **Data**: SQL, YAML, JSON, CSV, XML +- **Documentation**: Markdown, TXT, PDF +- **And many more!** -## Storage +## 💾 Storage & Organization -- Project indexes in `.leann/` directory (like `.git`) -- Global project registry at `~/.leann/projects.json` -- Multi-project support built-in +- **Project indexes**: Stored in `.leann/` directory (just like `.git`) +- **Global registry**: Project tracking at `~/.leann/projects.json` +- **Multi-project support**: Switch between different codebases seamlessly +- **Portable**: Transfer indexes between machines with minimal overhead -## Removing +## 🗑️ Uninstalling + +To remove the LEANN MCP server from Claude Code: ```bash claude mcp remove leann-server ``` +To remove LEANN itself: +```bash +uv pip uninstall leann leann-backend-hnsw leann-core +``` diff --git a/packages/leann/pyproject.toml b/packages/leann/pyproject.toml index 1f3cb50..17b50d8 100644 --- a/packages/leann/pyproject.toml +++ b/packages/leann/pyproject.toml @@ -4,7 +4,7 @@ [project] name = "leann" -version = "0.2.1" +version = "0.2.5" description = "LEANN - The smallest vector index in the world. RAG Everything with LEANN!" readme = "README.md" requires-python = ">=3.9"
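One implementation detail from the new `compute_embeddings_ollama` earlier in this diff worth calling out: because `as_completed` yields futures in completion order, results are written back into a pre-allocated list by original index so embeddings stay aligned with their input texts. A minimal standalone sketch of that pattern, with a stand-in worker instead of the real HTTP call:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def fake_embed(text: str) -> str:
    # Stand-in for the per-text Ollama request in compute_embeddings_ollama.
    return text.upper()

texts = ["alpha", "beta", "gamma", "delta"]
results = [None] * len(texts)  # pre-allocate to preserve input order

with ThreadPoolExecutor(max_workers=4) as pool:
    future_to_idx = {pool.submit(fake_embed, t): i for i, t in enumerate(texts)}
    for future in as_completed(future_to_idx):  # completion order is arbitrary...
        idx = future_to_idx[future]
        results[idx] = future.result()          # ...but each lands in its own slot

print(results)  # ['ALPHA', 'BETA', 'GAMMA', 'DELTA'] regardless of timing
```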