Compare commits

7 Commits

| Author | SHA1 | Date |
|---|---|---|
| | b6ab6f1993 | |
| | 9f2e82a838 | |
| | 0b2b799d5a | |
| | 0f790fbbd9 | |
| | 387ae21eba | |
| | 3cc329c3e7 | |
| | 5567302316 | |

README.md
```diff
@@ -6,6 +6,7 @@
   <img src="https://img.shields.io/badge/Python-3.9%2B-blue.svg" alt="Python 3.9+">
   <img src="https://img.shields.io/badge/License-MIT-green.svg" alt="MIT License">
   <img src="https://img.shields.io/badge/Platform-Linux%20%7C%20macOS-lightgrey" alt="Platform">
+  <img src="https://img.shields.io/badge/MCP-Native%20Integration-blue?style=flat-square" alt="MCP Integration">
 </p>
 
 <h2 align="center" tabindex="-1" class="heading-element" dir="auto">
```
```diff
@@ -16,9 +17,10 @@ LEANN is an innovative vector database that democratizes personal AI. Transform
 
 LEANN achieves this through *graph-based selective recomputation* with *high-degree preserving pruning*, computing embeddings on-demand instead of storing them all. [Illustration Fig →](#️-architecture--how-it-works) | [Paper →](https://arxiv.org/abs/2506.08276)
 
-**Ready to RAG Everything?** Transform your laptop into a personal AI assistant that can search your **[file system](#-personal-data-manager-process-any-documents-pdf-txt-md)**, **[emails](#-your-personal-email-secretary-rag-on-apple-mail)**, **[browser history](#-time-machine-for-the-web-rag-your-entire-browser-history)**, **[chat history](#-wechat-detective-unlock-your-golden-memories)**, or external knowledge bases (i.e., 60M documents) - all on your laptop, with zero cloud costs and complete privacy.
+**Ready to RAG Everything?** Transform your laptop into a personal AI assistant that can semantic search your **[file system](#-personal-data-manager-process-any-documents-pdf-txt-md)**, **[emails](#-your-personal-email-secretary-rag-on-apple-mail)**, **[browser history](#-time-machine-for-the-web-rag-your-entire-browser-history)**, **[chat history](#-wechat-detective-unlock-your-golden-memories)**, **[codebase](#-claude-code-integration-transform-your-development-workflow)**\*, or external knowledge bases (i.e., 60M documents) - all on your laptop, with zero cloud costs and complete privacy.
 
 > **🚀 NEW: Claude Code Integration!** LEANN now provides native MCP integration for Claude Code users. Index your codebase and get intelligent code assistance directly in Claude Code. [Setup Guide →](packages/leann-mcp/README.md)
+
+\* Claude Code only supports basic `grep`-style keyword search. **LEANN** is a drop-in **semantic search MCP service fully compatible with Claude Code**, unlocking intelligent retrieval without changing your workflow.
 
 
```
```diff
@@ -28,7 +30,7 @@ LEANN achieves this through *graph-based selective recomputation* with *high-deg
   <img src="assets/effects.png" alt="LEANN vs Traditional Vector DB Storage Comparison" width="70%">
 </p>
 
-> **The numbers speak for themselves:** Index 60 million Wikipedia chunks in just 6GB instead of 201GB. From emails to browser history, everything fits on your laptop. [See detailed benchmarks for different applications below ↓](#storage-comparison)
+> **The numbers speak for themselves:** Index 60 million text chunks in just 6GB instead of 201GB. From emails to browser history, everything fits on your laptop. [See detailed benchmarks for different applications below ↓](#storage-comparison)
 
 🔒 **Privacy:** Your data never leaves your laptop. No OpenAI, no cloud, no "terms of service".
```
````diff
@@ -221,7 +223,7 @@ Ask questions directly about your personal PDFs, documents, and any directory co
   <img src="videos/paper_clear.gif" alt="LEANN Document Search Demo" width="600">
 </p>
 
-The example below asks a question about summarizing our paper (uses default data in `data/`, which is a directory with diverse data sources: two papers, Pride and Prejudice, and a README in Chinese) and this is the **easiest example** to run here:
+The example below asks a question about summarizing our paper (uses default data in `data/`, which is a directory with diverse data sources: two papers, Pride and Prejudice, and a technical report in Chinese about LLMs at Huawei), and this is the **easiest example** to run here:
 
 ```bash
 source .venv/bin/activate  # Don't forget to activate the virtual environment
````
````diff
@@ -416,7 +418,26 @@ Once the index is built, you can ask questions like:
 
 </details>
 
+### 🚀 Claude Code Integration: Transform Your Development Workflow!
+
+**The future of code assistance is here.** Transform your development workflow with LEANN's native MCP integration for Claude Code. Index your entire codebase and get intelligent code assistance directly in your IDE.
+
+**Key features:**
+
+- 🔍 **Semantic code search** across your entire project
+- 📚 **Context-aware assistance** for debugging and development
+- 🚀 **Zero-config setup** with automatic language detection
+
+```bash
+# Install LEANN globally for MCP integration
+uv tool install leann-core
+
+# Setup is automatic - just start using Claude Code!
+```
+
+Try our fully agentic pipeline with auto query rewriting, semantic search planning, and more:
+
+
+
+**Ready to supercharge your coding?** [Complete Setup Guide →](packages/leann-mcp/README.md)
+
 ## 🖥️ Command Line Interface
````
````diff
@@ -446,11 +467,8 @@ leann --help
 ### Usage Examples
 
 ```bash
-# Build an index from current directory (default)
-leann build my-docs
-
-# Or from specific directory
-leann build my-docs --docs ./documents
+# Build from a specific directory; my-docs is the index name
+leann build my-docs --docs ./your_documents
 
 # Search your documents
 leann search my-docs "machine learning concepts"
````
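The CLI calls above can also be scripted. A minimal Python sketch that shells out to a command and captures its output (it assumes the `leann` command is globally on PATH; the `program` parameter is only there so the helper can be exercised with any binary):

```python
import subprocess

def run_cli(program: str, *args: str) -> str:
    """Run a CLI tool and return its stdout, raising on a non-zero exit."""
    proc = subprocess.run([program, *args], capture_output=True, text=True, check=True)
    return proc.stdout

# Example calls mirroring the README (not executed here):
# run_cli("leann", "build", "my-docs", "--docs", "./your_documents")
# run_cli("leann", "search", "my-docs", "machine learning concepts")
```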
assets/mcp_leann.png (new binary file, 224 KiB; not shown)
@@ -1,150 +0,0 @@ (deleted file, translated from Chinese)

# Claude Code x LEANN Integration Guide

## ✅ Current Status: It Already Works!

Good news: the LEANN CLI already works in Claude Code with no modification needed!

## 🚀 Get Started Now

### 1. Activate the Environment
```bash
# From the LEANN project directory
source .venv/bin/activate.fish  # fish shell
# or
source .venv/bin/activate  # bash shell
```

### 2. Basic Commands

#### List existing indexes
```bash
leann list
```

#### Search documents
```bash
leann search my-docs "machine learning" --recompute-embeddings
```

#### Ask questions
```bash
echo "What is machine learning?" | leann ask my-docs --llm ollama --model qwen3:8b --recompute-embeddings
```

#### Build a new index
```bash
leann build project-docs --docs ./src --recompute-embeddings
```

## 💡 Claude Code Tips

### Using LEANN directly in Claude Code

1. **Activate the environment**:
```bash
cd /Users/andyl/Projects/LEANN-RAG
source .venv/bin/activate.fish
```

2. **Search the codebase**:
```bash
leann search my-docs "authentication patterns" --recompute-embeddings --top-k 10
```

3. **Intelligent Q&A**:
```bash
echo "How does the authentication system work?" | leann ask my-docs --llm ollama --model qwen3:8b --recompute-embeddings
```

### Batch operation examples

```bash
# Build a project documentation index
leann build project-docs --docs ./docs --force

# Search multiple keywords
leann search project-docs "API authentication" --recompute-embeddings
leann search project-docs "database schema" --recompute-embeddings
leann search project-docs "deployment guide" --recompute-embeddings

# Q&A mode
echo "What are the API endpoints?" | leann ask project-docs --recompute-embeddings
```

## 🎯 Workflows Claude Can Run Immediately

### Code analysis workflow
```bash
# 1. Build a codebase index
leann build codebase --docs ./src --backend hnsw --recompute-embeddings

# 2. Analyze the architecture
echo "What is the overall architecture?" | leann ask codebase --recompute-embeddings

# 3. Find specific functionality
leann search codebase "user authentication" --recompute-embeddings --top-k 5

# 4. Understand implementation details
echo "How is user authentication implemented?" | leann ask codebase --recompute-embeddings
```

### Documentation workflow
```bash
# 1. Index the project docs
leann build docs --docs ./docs --recompute-embeddings

# 2. Quickly look up information
leann search docs "installation requirements" --recompute-embeddings

# 3. Get detailed explanations
echo "What are the system requirements?" | leann ask docs --recompute-embeddings
```

## ⚠️ Important Notes

1. **You must pass `--recompute-embeddings`** - this flag is required; omitting it raises an error
2. **Activate the virtual environment first** - make sure LEANN's Python environment is available
3. **Ollama must be installed beforehand** - the ask feature needs a local LLM

## 🔥 Ready-to-Use Claude Prompt

```
Help me analyze this codebase using LEANN:

1. First, activate the environment:
   cd /Users/andyl/Projects/LEANN-RAG && source .venv/bin/activate.fish

2. Build an index of the source code:
   leann build codebase --docs ./src --recompute-embeddings

3. Search for authentication patterns:
   leann search codebase "authentication middleware" --recompute-embeddings --top-k 10

4. Ask about the authentication system:
   echo "How does user authentication work in this codebase?" | leann ask codebase --recompute-embeddings

Please execute these commands and help me understand the code structure.
```

## 📈 Next Improvements

It already works, but it could be further optimized:

1. **Simplify commands** - enable recompute-embeddings by default
2. **Configuration file** - avoid retyping parameters
3. **State management** - auto-detect the environment and indexes
4. **Output format** - output better suited for Claude to parse

These are all icing on the cake; it is usable right now!

## 🎉 Summary

**LEANN already works perfectly in Claude Code!**

- ✅ Search works
- ✅ RAG Q&A works
- ✅ Index building works
- ✅ Multiple data sources supported
- ✅ Local LLMs supported

Just remember to add the `--recompute-embeddings` flag!
```diff
@@ -4,8 +4,8 @@ build-backend = "scikit_build_core.build"
 
 [project]
 name = "leann-backend-diskann"
-version = "0.2.2"
-dependencies = ["leann-core==0.2.2", "numpy", "protobuf>=3.19.0"]
+version = "0.2.5"
+dependencies = ["leann-core==0.2.5", "numpy", "protobuf>=3.19.0"]
 
 [tool.scikit-build]
 # Key: simplified CMake path
```
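Each backend package pins `leann-core` to the same number as its own `version` (both move 0.2.2 → 0.2.5 in this compare). A hedged sketch of a consistency check over a pyproject.toml string; the regexes assume the exact key layout shown in these diffs:

```python
import re

# Match the package's own version line and its leann-core pin
# (layout assumptions based on the pyproject.toml hunks above).
VERSION_RE = re.compile(r'^version = "([^"]+)"', re.M)
CORE_PIN_RE = re.compile(r'"leann-core==([^"]+)"')

def pins_match(pyproject_text: str) -> bool:
    """True if the package version equals its pinned leann-core version."""
    version = VERSION_RE.search(pyproject_text)
    pin = CORE_PIN_RE.search(pyproject_text)
    if version is None or pin is None:
        return False
    return version.group(1) == pin.group(1)
```

A bump that touches `version` but forgets the pin (or vice versa) would fail this check, which is exactly the mistake a lockstep release like this one guards against.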
```diff
@@ -6,10 +6,10 @@ build-backend = "scikit_build_core.build"
 
 [project]
 name = "leann-backend-hnsw"
-version = "0.2.2"
+version = "0.2.5"
 description = "Custom-built HNSW (Faiss) backend for the Leann toolkit."
 dependencies = [
-    "leann-core==0.2.2",
+    "leann-core==0.2.5",
     "numpy",
     "pyzmq>=23.0.0",
     "msgpack>=1.0.0",
```
```diff
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "leann-core"
-version = "0.2.2"
+version = "0.2.5"
 description = "Core API and plugin system for LEANN"
 readme = "README.md"
 requires-python = ">=3.9"
```
```diff
@@ -17,12 +17,12 @@ logging.basicConfig(level=logging.INFO)
 logger = logging.getLogger(__name__)
 
 
-def check_ollama_models() -> list[str]:
+def check_ollama_models(host: str) -> list[str]:
     """Check available Ollama models and return a list"""
     try:
         import requests
 
-        response = requests.get("http://localhost:11434/api/tags", timeout=5)
+        response = requests.get(f"{host}/api/tags", timeout=5)
         if response.status_code == 200:
             data = response.json()
             return [model["name"] for model in data.get("models", [])]
```
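This change threads the Ollama host through to the `/api/tags` request instead of hard-coding localhost. A small sketch of the two pure pieces (URL construction and response parsing), exercised against a canned payload rather than a live server; the payload shape `{"models": [{"name": ...}]}` is the one the surrounding code already assumes:

```python
def tags_url(host: str) -> str:
    # Mirrors the diff: the endpoint is derived from the caller's host.
    return f"{host}/api/tags"

def parse_models(data: dict) -> list[str]:
    # Same extraction as check_ollama_models: one name per model entry.
    return [model["name"] for model in data.get("models", [])]
```

With this split, a non-default Ollama host (e.g. a remote box on port 11434, hypothetical here) is handled the same way as the localhost default.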
```diff
@@ -309,10 +309,12 @@ def search_hf_models(query: str, limit: int = 10) -> list[str]:
     return search_hf_models_fuzzy(query, limit)
 
 
-def validate_model_and_suggest(model_name: str, llm_type: str) -> str | None:
+def validate_model_and_suggest(
+    model_name: str, llm_type: str, host: str = "http://localhost:11434"
+) -> str | None:
     """Validate model name and provide suggestions if invalid"""
     if llm_type == "ollama":
-        available_models = check_ollama_models()
+        available_models = check_ollama_models(host)
         if available_models and model_name not in available_models:
             error_msg = f"Model '{model_name}' not found in your local Ollama installation."
```
```diff
@@ -469,7 +471,7 @@ class OllamaChat(LLMInterface):
         requests.get(host)
 
         # Pre-check model availability with helpful suggestions
-        model_error = validate_model_and_suggest(model, "ollama")
+        model_error = validate_model_and_suggest(model, "ollama", host)
         if model_error:
             raise ValueError(model_error)
```
```diff
@@ -1,7 +1,6 @@
 #!/usr/bin/env python3
 
 import json
-import os
 import subprocess
 import sys
```
```diff
@@ -62,10 +61,6 @@ def handle_request(request):
     tool_name = request["params"]["name"]
     args = request["params"].get("arguments", {})
 
-    # Set working directory and environment
-    env = os.environ.copy()
-    cwd = "/Users/andyl/Projects/LEANN-RAG"
-
     try:
         if tool_name == "leann_search":
             cmd = [
```
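Dropping the hard-coded `cwd` and `env` means the handler now relies on `leann` being globally installed rather than on one developer's checkout path. For context, a minimal sketch of the JSON-RPC 2.0 envelopes such a handler returns; the `result.content` shape is an assumption modeled on typical MCP tool results, not copied from this file:

```python
def make_result(request_id, text: str) -> dict:
    """Wrap tool output in a JSON-RPC 2.0 result envelope (MCP-style content)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "result": {"content": [{"type": "text", "text": text}]},
    }

def make_error(request_id, code: int, message: str) -> dict:
    """JSON-RPC 2.0 error envelope; codes per the JSON-RPC 2.0 spec."""
    return {"jsonrpc": "2.0", "id": request_id, "error": {"code": code, "message": message}}
```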
```diff
@@ -76,18 +71,14 @@ def handle_request(request):
                 "--recompute-embeddings",
                 f"--top-k={args.get('top_k', 5)}",
             ]
-            result = subprocess.run(cmd, capture_output=True, text=True, cwd=cwd, env=env)
+            result = subprocess.run(cmd, capture_output=True, text=True)
 
         elif tool_name == "leann_ask":
             cmd = f'echo "{args["question"]}" | leann ask {args["index_name"]} --recompute-embeddings --llm ollama --model qwen3:8b'
-            result = subprocess.run(
-                cmd, shell=True, capture_output=True, text=True, cwd=cwd, env=env
-            )
+            result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
 
         elif tool_name == "leann_list":
-            result = subprocess.run(
-                ["leann", "list"], capture_output=True, text=True, cwd=cwd, env=env
-            )
+            result = subprocess.run(["leann", "list"], capture_output=True, text=True)
 
         return {
             "jsonrpc": "2.0",
```
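One caveat with the `leann_ask` branch: interpolating the question into an `echo "…" | leann ask …` shell string breaks on embedded quotes. A hedged alternative sketch that feeds the question via stdin with no shell involved (this assumes `leann ask` reads the question from stdin, which is what the echo pipe implies; the flags are the ones the handler above already uses):

```python
import subprocess

def ask_command(index_name: str, argv0: str = "leann") -> list[str]:
    # argv0 is parameterized only so the command list can be tested without leann.
    return [argv0, "ask", index_name, "--recompute-embeddings",
            "--llm", "ollama", "--model", "qwen3:8b"]

def ask_via_stdin(index_name: str, question: str) -> str:
    """Send the question on stdin instead of interpolating it into a shell string."""
    proc = subprocess.run(ask_command(index_name), input=question,
                          capture_output=True, text=True)
    return proc.stdout
```

Because there is no shell, a question like `what does "auth" mean?` needs no quoting or escaping at all.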
````diff
@@ -7,7 +7,7 @@ Intelligent code assistance using LEANN's vector search directly in Claude Code.
 First, install LEANN CLI globally:
 
 ```bash
-uv tool install leann
+uv tool install leann-core
 ```
 
 This makes the `leann` command available system-wide, which `leann_mcp` requires.
````
````diff
@@ -30,7 +30,7 @@ claude mcp add leann-server -- leann_mcp
 
 ```bash
 # Build an index for your project
-leann build my-project
+leann build my-project --docs ./  # change to your doc PATH
 
 # Start Claude Code
 claude
````
```diff
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "leann"
-version = "0.2.2"
+version = "0.2.5"
 description = "LEANN - The smallest vector index in the world. RAG Everything with LEANN!"
 readme = "README.md"
 requires-python = ">=3.9"
```