Compare commits: config...refactor-a (38 commits)

Commits: 0877960547, d68af63d05, b844aca968, 85277ba67a, e9562acdc2, 7fd3db1ddb, c1ccc51a75, b0239b6e4d, 58556ef44c, 87c930d705, 86f919a6da, f8d34663b4, 568cf597f4, baf70dc411, 7ad2ec39d6, 31fd3c816a, 1f6c7f2f5a, c1124eb349, 274bbb19ea, 8c152c7a31, ce77eef13a, 9d77175ac8, 7fbb6c98ef, 914a248c28, 55fc5862f9, fd97b8dfa8, 57959947a1, cc0c091ca5, ff389c7d8d, 6780a8eaba, 984056f126, bd4451bf50, 34e313f64a, ddc789b231, ff1b622bdd, 3cde4fc7b3, 4e3bcda5fa, 46f6f76fc3
@@ -170,8 +170,6 @@ ollama pull llama3.2:1b
 
 LEANN provides flexible parameters for embedding models, search strategies, and data processing to fit your specific needs.
 
-📚 **Need configuration best practices?** Check our [Configuration Guide](docs/configuration-guide.md) for detailed optimization tips, model selection advice, and solutions to common issues like slow embeddings or poor search quality.
-
 <details>
 <summary><strong>📋 Click to expand: Common Parameters (Available in All Examples)</strong></summary>
 
@@ -516,7 +514,7 @@ Options:
 - **Dynamic batching:** Efficiently batch embedding computations for GPU utilization
 - **Two-level search:** Smart graph traversal that prioritizes promising nodes
 
-**Backends:** HNSW (default) for most use cases, with optional DiskANN support for billion-scale datasets.
+**Backends:** DiskANN or HNSW - pick what works for your data size.
 
 ## Benchmarks
 
@@ -536,7 +534,8 @@ Options:
 
 ```bash
 uv pip install -e ".[dev]" # Install dev dependencies
-python benchmarks/run_evaluation.py # Will auto-download evaluation data and run benchmarks
+python benchmarks/run_evaluation.py data/indices/dpr/dpr_diskann # DPR dataset
+python benchmarks/run_evaluation.py data/indices/rpj_wiki/rpj_wiki.index # Wikipedia
 ```
 
 The evaluation script downloads data automatically on first run. The last three results were tested with partial personal data, and you can reproduce them with your own data!
@@ -99,9 +99,7 @@ if __name__ == "__main__":
     print("- 'What are the main techniques LEANN uses?'")
     print("- 'What is the technique DLPM?'")
     print("- 'Who does Elizabeth Bennet marry?'")
-    print(
-        "- 'What is the problem of developing pan gu model Huawei meets? (盘古大模型开发中遇到什么问题?)'"
-    )
+    print("- 'What is the problem of developing pan gu model? (盘古大模型开发中遇到什么问题?)'")
     print("\nOr run without --query for interactive mode\n")
 
     rag = DocumentRAG()
@@ -1,236 +0,0 @@
# LEANN Configuration Guide

This guide helps you optimize LEANN for different use cases and understand the trade-offs between various configuration options.

## Getting Started: Simple is Better

When first trying LEANN, start with a small dataset to quickly validate your approach:

**For document RAG**: The default `data/` directory works perfectly - includes 2 AI research papers, Pride and Prejudice literature, and a technical report
```bash
python -m apps.document_rag --query "What techniques does LEANN use?"
```

**For other data sources**: Limit the dataset size for quick testing
```bash
# WeChat: Test with recent messages only
python -m apps.wechat_rag --max-items 100 --query "What did we discuss about the project timeline?"

# Browser history: Last few days
python -m apps.browser_rag --max-items 500 --query "Find documentation about vector databases"

# Email: Recent inbox
python -m apps.email_rag --max-items 200 --query "Who sent updates about the deployment status?"
```

Once validated, scale up gradually:
- 100 documents → 1,000 → 10,000 → full dataset (`--max-items -1`)
- This helps identify issues early before committing to long processing times

## Embedding Model Selection: Understanding the Trade-offs

Based on our experience developing LEANN, embedding models fall into three categories:

### Small Models (< 100M parameters)
**Example**: `sentence-transformers/all-MiniLM-L6-v2` (22M params)
- **Pros**: Lightweight, fast for both indexing and inference
- **Cons**: Lower semantic understanding, may miss nuanced relationships
- **Use when**: Speed is critical, handling simple queries, interactive mode, or just experimenting with LEANN. If time is not a constraint, consider using a larger/better embedding model

### Medium Models (100M-500M parameters)
**Example**: `facebook/contriever` (110M params), `BAAI/bge-base-en-v1.5` (110M params)
- **Pros**: Balanced performance, good multilingual support, reasonable speed
- **Cons**: Requires more compute than small models
- **Use when**: Need quality results without extreme compute requirements, general-purpose RAG applications

### Large Models (500M+ parameters)
**Example**: `Qwen/Qwen3-Embedding-0.6B` (600M params), `intfloat/multilingual-e5-large` (560M params)
- **Pros**: Best semantic understanding, captures complex relationships, excellent multilingual support. **Qwen3-Embedding-0.6B achieves nearly OpenAI API performance!**
- **Cons**: Slower inference, longer index build times
- **Use when**: Quality is paramount and you have sufficient compute resources. **Highly recommended** for production use

### Quick Start: OpenAI Embeddings (Fastest Setup)

For immediate testing without local model downloads:
```bash
# Set OpenAI embeddings (requires OPENAI_API_KEY)
--embedding-mode openai --embedding-model text-embedding-3-small
```

<details>
<summary><strong>Cloud vs Local Trade-offs</strong></summary>

**OpenAI Embeddings** (`text-embedding-3-small/large`)
- **Pros**: No local compute needed, consistently fast, high quality
- **Cons**: Requires API key, costs money, data leaves your system, [known limitations with certain languages](https://yichuan-w.github.io/blog/lessons_learned_in_dev_leann/)
- **When to use**: Prototyping, non-sensitive data, need immediate results

**Local Embeddings**
- **Pros**: Complete privacy, no ongoing costs, full control, can sometimes outperform OpenAI embeddings
- **Cons**: Slower than cloud APIs, requires local compute resources
- **When to use**: Production systems, sensitive data, cost-sensitive applications

</details>

## Index Selection: Matching Your Scale

### HNSW (Hierarchical Navigable Small World)
**Best for**: Small to medium datasets (< 10M vectors) - **Default and recommended for extreme low storage**
- Full recomputation required
- High memory usage during build phase
- Excellent recall (95%+)

```bash
# Optimal for most use cases
--backend-name hnsw --graph-degree 32 --build-complexity 64
```

### DiskANN
**Best for**: Large datasets (> 10M vectors, 10GB+ index size) - **⚠️ Beta version, still in active development**
- Uses Product Quantization (PQ) for coarse filtering during graph traversal
- Novel approach: stores only PQ codes, performs rerank with exact computation in final step
- Implements a corner case of double-queue: prunes all neighbors and recomputes at the end

```bash
# For billion-scale deployments
--backend-name diskann --graph-degree 64 --build-complexity 128
```

## LLM Selection: Engine and Model Comparison

### LLM Engines

**OpenAI** (`--llm openai`)
- **Pros**: Best quality, consistent performance, no local resources needed
- **Cons**: Costs money ($0.15-2.5 per million tokens), requires internet, data privacy concerns
- **Models**: `gpt-4o-mini` (fast, cheap), `gpt-4o` (best quality), `o3-mini` (reasoning, not so expensive)
- **Note**: Our current default, but we recommend switching to Ollama for most use cases

**Ollama** (`--llm ollama`)
- **Pros**: Fully local, free, privacy-preserving, good model variety
- **Cons**: Requires local GPU/CPU resources, slower than cloud APIs, need to install extra [ollama app](https://github.com/ollama/ollama?tab=readme-ov-file#ollama) and pre-download models by `ollama pull`
- **Models**: `qwen3:0.6b` (ultra-fast), `qwen3:1.7b` (balanced), `qwen3:4b` (good quality), `qwen3:7b` (high quality), `deepseek-r1:1.5b` (reasoning)

**HuggingFace** (`--llm hf`)
- **Pros**: Free tier available, huge model selection, direct model loading (vs Ollama's server-based approach)
- **Cons**: More complex initial setup
- **Models**: `Qwen/Qwen3-1.7B-FP8`

## Parameter Tuning Guide

### Search Complexity Parameters

**`--build-complexity`** (index building)
- Controls thoroughness during index construction
- Higher = better recall but slower build
- Recommendations:
  - 32: Quick prototyping
  - 64: Balanced (default)
  - 128: Production systems
  - 256: Maximum quality

**`--search-complexity`** (query time)
- Controls search thoroughness
- Higher = better results but slower
- Recommendations:
  - 16: Fast/Interactive search
  - 32: High quality with diversity
  - 64+: Maximum accuracy

### Top-K Selection

**`--top-k`** (number of retrieved chunks)
- More chunks = better context but slower LLM processing
- Should be always smaller than `--search-complexity`
- Guidelines:
  - 10-20: General questions (default: 20)
  - 30+: Complex multi-hop reasoning requiring comprehensive context

**Trade-off formula**:
- Retrieval time ∝ log(n) × search_complexity
- LLM processing time ∝ top_k × chunk_size
- Total context = top_k × chunk_size tokens

### Graph Degree (HNSW/DiskANN)

**`--graph-degree`**
- Number of connections per node in the graph
- Higher = better recall but more memory
- HNSW: 16-32 (default: 32)
- DiskANN: 32-128 (default: 64)


## Performance Optimization Checklist

### If Embedding is Too Slow

1. **Switch to smaller model**:
```bash
# From large model
--embedding-model Qwen/Qwen3-Embedding-0.6B
# To small model
--embedding-model sentence-transformers/all-MiniLM-L6-v2
```

2. **Limit dataset size for testing**:
```bash
--max-items 1000 # Process first 1k items only
```

3. **Use MLX on Apple Silicon** (optional optimization):
```bash
--embedding-mode mlx --embedding-model mlx-community/multilingual-e5-base-mlx
```

### If Search Quality is Poor

1. **Increase retrieval count**:
```bash
--top-k 30 # Retrieve more candidates
```

2. **Upgrade embedding model**:
```bash
# For English
--embedding-model BAAI/bge-base-en-v1.5
# For multilingual
--embedding-model intfloat/multilingual-e5-large
```

## Understanding the Trade-offs

Every configuration choice involves trade-offs:

| Factor | Small/Fast | Large/Quality |
|--------|------------|---------------|
| Embedding Model | `all-MiniLM-L6-v2` | `Qwen/Qwen3-Embedding-0.6B` |
| Chunk Size | 512 tokens | 128 tokens |
| Index Type | HNSW | DiskANN |
| LLM | `qwen3:1.7b` | `gpt-4o` |

The key is finding the right balance for your specific use case. Start small and simple, measure performance, then scale up only where needed.

## Deep Dive: Critical Configuration Decisions

### When to Disable Recomputation

LEANN's recomputation feature provides exact distance calculations but can be disabled for extreme QPS requirements:

```bash
--no-recompute # Disable selective recomputation
```

**Trade-offs**:
- **With recomputation** (default): Exact distances, best quality, higher latency, minimal storage (only stores metadata, recomputes embeddings on-demand)
- **Without recomputation**: Must store full embeddings, significantly higher memory and storage usage (10-100x more), but faster search

**Disable when**:
- You have abundant storage and memory
- Need extremely low latency (< 100ms)
- Running a read-heavy workload where storage cost is acceptable

## Further Reading

- [Lessons Learned Developing LEANN](https://yichuan-w.github.io/blog/lessons_learned_in_dev_leann/)
- [LEANN Technical Paper](https://arxiv.org/abs/2506.08276)
- [DiskANN Original Paper](https://papers.nips.cc/paper/2019/file/09853c7fb1d3f8ee67a61b6bf4a7f8e6-Paper.pdf)
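The removed guide's "Trade-off formula" bullets are easy to sanity-check with a few lines of Python. The sketch below is illustrative only: the corpus size, chunk size, and complexity values are assumed examples, not LEANN defaults or measurements.

```python
import math

# Assumed example values -- for illustration, not LEANN defaults.
num_vectors = 1_000_000      # corpus size n
top_k = 20                   # retrieved chunks
chunk_size = 256             # tokens per chunk
search_complexity = 32

# Total context handed to the LLM grows linearly with top_k:
context_tokens = top_k * chunk_size            # 5120 tokens
print(f"LLM context: {context_tokens} tokens")

# Retrieval cost grows roughly with log(n) * search_complexity:
relative_cost = math.log(num_vectors) * search_complexity
print(f"Relative retrieval cost: {relative_cost:.0f}")
```

Doubling `--top-k` doubles the context (and the LLM processing time), while doubling the corpus size only adds a log factor to retrieval, which is why the guide suggests tuning `top_k` and `search_complexity` before reaching for a larger index.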
@@ -5,7 +5,7 @@
 - **🔄 Real-time Embeddings** - Eliminate heavy embedding storage with dynamic computation using optimized ZMQ servers and highly optimized search paradigm (overlapping and batching) with highly optimized embedding engine
 - **📈 Scalable Architecture** - Handles millions of documents on consumer hardware; the larger your dataset, the more LEANN can save
 - **🎯 Graph Pruning** - Advanced techniques to minimize the storage overhead of vector search to a limited footprint
-- **🏗️ Pluggable Backends** - HNSW/FAISS (default), with optional DiskANN for large-scale deployments
+- **🏗️ Pluggable Backends** - DiskANN, HNSW/FAISS with unified API
 
 ## 🛠️ Technical Highlights
 - **🔄 Recompute Mode** - Highest accuracy scenarios while eliminating vector storage overhead
@@ -2,8 +2,8 @@
 
 ## 🎯 Q2 2025
 
-- [X] HNSW backend integration
 - [X] DiskANN backend with MIPS/L2/Cosine support
+- [X] HNSW backend integration
 - [X] Real-time embedding pipeline
 - [X] Memory-efficient graph pruning
 
@@ -7,7 +7,6 @@ from pathlib import Path
 from typing import Any, Literal
 
 import numpy as np
-import psutil
 from leann.interface import (
     LeannBackendBuilderInterface,
     LeannBackendFactoryInterface,
@@ -85,43 +84,6 @@ def _write_vectors_to_bin(data: np.ndarray, file_path: Path):
         f.write(data.tobytes())
 
 
-def _calculate_smart_memory_config(data: np.ndarray) -> tuple[float, float]:
-    """
-    Calculate smart memory configuration for DiskANN based on data size and system specs.
-
-    Args:
-        data: The embedding data array
-
-    Returns:
-        tuple: (search_memory_maximum, build_memory_maximum) in GB
-    """
-    num_vectors, dim = data.shape
-
-    # Calculate embedding storage size
-    embedding_size_bytes = num_vectors * dim * 4  # float32 = 4 bytes
-    embedding_size_gb = embedding_size_bytes / (1024**3)
-
-    # search_memory_maximum: 1/10 of embedding size for optimal PQ compression
-    # This controls Product Quantization size - smaller means more compression
-    search_memory_gb = max(0.1, embedding_size_gb / 10)  # At least 100MB
-
-    # build_memory_maximum: Based on available system RAM for sharding control
-    # This controls how much memory DiskANN uses during index construction
-    available_memory_gb = psutil.virtual_memory().available / (1024**3)
-    total_memory_gb = psutil.virtual_memory().total / (1024**3)
-
-    # Use 50% of available memory, but at least 2GB and at most 75% of total
-    build_memory_gb = max(2.0, min(available_memory_gb * 0.5, total_memory_gb * 0.75))
-
-    logger.info(
-        f"Smart memory config - Data: {embedding_size_gb:.2f}GB, "
-        f"Search mem: {search_memory_gb:.2f}GB (PQ control), "
-        f"Build mem: {build_memory_gb:.2f}GB (sharding control)"
-    )
-
-    return search_memory_gb, build_memory_gb
-
-
 @register_backend("diskann")
 class DiskannBackend(LeannBackendFactoryInterface):
     @staticmethod
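To make the removed heuristic concrete, here is the same arithmetic applied to a hypothetical corpus of 1M vectors at dimension 768 in float32 (the corpus size and dimension are assumptions for illustration, not values from this PR):

```python
import psutil

# Hypothetical corpus: 1,000,000 vectors x 768 dims x 4 bytes (float32).
embedding_size_gb = 1_000_000 * 768 * 4 / (1024**3)   # ~2.86 GB of raw embeddings
search_memory_gb = max(0.1, embedding_size_gb / 10)    # ~0.29 GB PQ budget (1/10 rule)

available_gb = psutil.virtual_memory().available / (1024**3)
total_gb = psutil.virtual_memory().total / (1024**3)
build_memory_gb = max(2.0, min(available_gb * 0.5, total_gb * 0.75))  # clamped to system RAM

print(f"search_memory_maximum ~ {search_memory_gb:.2f} GB")
print(f"build_memory_maximum  ~ {build_memory_gb:.2f} GB")
```

After this change the builder falls back to fixed defaults (4.0 GB search, 8.0 GB build) unless the caller passes `search_memory_maximum` / `build_memory_maximum` explicitly, as the following hunks show.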
@@ -159,16 +121,6 @@ class DiskannBuilder(LeannBackendBuilderInterface):
                 f"Unsupported distance_metric '{build_kwargs.get('distance_metric', 'unknown')}'."
             )
 
-        # Calculate smart memory configuration if not explicitly provided
-        if (
-            "search_memory_maximum" not in build_kwargs
-            or "build_memory_maximum" not in build_kwargs
-        ):
-            smart_search_mem, smart_build_mem = _calculate_smart_memory_config(data)
-        else:
-            smart_search_mem = build_kwargs.get("search_memory_maximum", 4.0)
-            smart_build_mem = build_kwargs.get("build_memory_maximum", 8.0)
-
         try:
             from . import _diskannpy as diskannpy  # type: ignore
 
@@ -179,8 +131,8 @@ class DiskannBuilder(LeannBackendBuilderInterface):
                 index_prefix,
                 build_kwargs.get("complexity", 64),
                 build_kwargs.get("graph_degree", 32),
-                build_kwargs.get("search_memory_maximum", smart_search_mem),
-                build_kwargs.get("build_memory_maximum", smart_build_mem),
+                build_kwargs.get("search_memory_maximum", 4.0),
+                build_kwargs.get("build_memory_maximum", 8.0),
                 build_kwargs.get("num_threads", 8),
                 build_kwargs.get("pq_disk_bytes", 0),
                 "",
@@ -4,8 +4,8 @@ build-backend = "scikit_build_core.build"
 
 [project]
 name = "leann-backend-diskann"
-version = "0.2.0"
-dependencies = ["leann-core==0.2.0", "numpy", "protobuf>=3.19.0"]
+version = "0.1.16"
+dependencies = ["leann-core==0.1.16", "numpy", "protobuf>=3.19.0"]
 
 [tool.scikit-build]
 # Key: simplified CMake path
Submodule packages/leann-backend-diskann/third_party/DiskANN updated: 67a2611ad1...af2a26481e
@@ -6,10 +6,10 @@ build-backend = "scikit_build_core.build"
 
 [project]
 name = "leann-backend-hnsw"
-version = "0.2.0"
+version = "0.1.16"
 description = "Custom-built HNSW (Faiss) backend for the Leann toolkit."
 dependencies = [
-    "leann-core==0.2.0",
+    "leann-core==0.1.16",
     "numpy",
     "pyzmq>=23.0.0",
     "msgpack>=1.0.0",
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "leann-core"
-version = "0.2.0"
+version = "0.1.16"
 description = "Core API and plugin system for LEANN"
 readme = "README.md"
 requires-python = ">=3.9"
@@ -636,10 +636,7 @@ class LeannChat:
             "Please provide the best answer you can based on this context and your knowledge."
         )
 
-        ask_time = time.time()
         ans = self.llm.ask(prompt, **llm_kwargs)
-        ask_time = time.time() - ask_time
-        logger.info(f" Ask time: {ask_time} seconds")
         return ans
 
     def start_interactive(self):
@@ -358,11 +358,7 @@ def validate_model_and_suggest(model_name: str, llm_type: str) -> str | None:
         error_msg += f"\n\nModel '{model_name}' was not found in Ollama's library."
 
         if suggestions:
-            error_msg += (
-                "\n\nDid you mean one of these installed models?\n"
-                + "\nTry to use ollama pull to install the model you need\n"
-            )
-
+            error_msg += "\n\nDid you mean one of these installed models?\n"
             for i, suggestion in enumerate(suggestions, 1):
                 error_msg += f" {i}. {suggestion}\n"
         else:
@@ -546,41 +542,14 @@ class HFChat(LLMInterface):
         self.device = "cpu"
         logger.info("No GPU detected. Using CPU.")
 
-        # Load tokenizer and model with timeout protection
-        try:
-            import signal
-
-            def timeout_handler(signum, frame):
-                raise TimeoutError("Model download/loading timed out")
-
-            # Set timeout for model loading (60 seconds)
-            old_handler = signal.signal(signal.SIGALRM, timeout_handler)
-            signal.alarm(60)
-
-            try:
-                logger.info(f"Loading tokenizer for {model_name}...")
-                self.tokenizer = AutoTokenizer.from_pretrained(model_name)
-
-                logger.info(f"Loading model {model_name}...")
-                self.model = AutoModelForCausalLM.from_pretrained(
-                    model_name,
-                    torch_dtype=torch.float16 if self.device != "cpu" else torch.float32,
-                    device_map="auto" if self.device != "cpu" else None,
-                    trust_remote_code=True,
-                )
-                logger.info(f"Successfully loaded {model_name}")
-            finally:
-                signal.alarm(0)  # Cancel the alarm
-                signal.signal(signal.SIGALRM, old_handler)  # Restore old handler
-
-        except TimeoutError:
-            logger.error(f"Model loading timed out for {model_name}")
-            raise RuntimeError(
-                f"Model loading timed out for {model_name}. Please check your internet connection or try a smaller model."
-            )
-        except Exception as e:
-            logger.error(f"Failed to load model {model_name}: {e}")
-            raise
+        # Load tokenizer and model
+        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
+        self.model = AutoModelForCausalLM.from_pretrained(
+            model_name,
+            torch_dtype=torch.float16 if self.device != "cpu" else torch.float32,
+            device_map="auto" if self.device != "cpu" else None,
+            trust_remote_code=True,
+        )
 
 # Move model to device if not using device_map
 if self.device != "cpu" and "device_map" not in str(self.model):
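The removed block wrapped model loading in a `SIGALRM`-based timeout. For reference, that pattern can be written as a small reusable context manager; this is a minimal sketch (not code from the PR), and it only works on Unix in the main thread, the same constraint the deleted code had:

```python
import signal
from contextlib import contextmanager


@contextmanager
def alarm_timeout(seconds: int):
    """Raise TimeoutError if the wrapped block runs longer than `seconds` (Unix, main thread only)."""

    def _handler(signum, frame):
        raise TimeoutError(f"operation timed out after {seconds}s")

    old_handler = signal.signal(signal.SIGALRM, _handler)
    signal.alarm(seconds)
    try:
        yield
    finally:
        signal.alarm(0)                              # cancel any pending alarm
        signal.signal(signal.SIGALRM, old_handler)   # restore the previous handler


# Hypothetical usage, mirroring the removed guard:
# with alarm_timeout(60):
#     tokenizer = AutoTokenizer.from_pretrained(model_name)
#     model = AutoModelForCausalLM.from_pretrained(model_name)
```

Dropping the guard simplifies HFChat's model-loading path, at the cost of no longer bounding how long a stalled download can block.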
@@ -354,21 +354,13 @@ class EmbeddingServerManager:
             self.server_process.terminate()
 
             try:
-                self.server_process.wait(timeout=3)
+                self.server_process.wait(timeout=5)
                 logger.info(f"Server process {self.server_process.pid} terminated.")
             except subprocess.TimeoutExpired:
                 logger.warning(
-                    f"Server process {self.server_process.pid} did not terminate gracefully within 3 seconds, killing it."
+                    f"Server process {self.server_process.pid} did not terminate gracefully, killing it."
                 )
                 self.server_process.kill()
-                try:
-                    self.server_process.wait(timeout=2)
-                    logger.info(f"Server process {self.server_process.pid} killed successfully.")
-                except subprocess.TimeoutExpired:
-                    logger.error(
-                        f"Failed to kill server process {self.server_process.pid} - it may be hung"
-                    )
-                    # Don't hang indefinitely
 
             # Clean up process resources to prevent resource tracker warnings
             try:
@@ -5,8 +5,11 @@ LEANN is a revolutionary vector database that democratizes personal AI. Transfor
 ## Installation
 
 ```bash
-# Default installation (includes both HNSW and DiskANN backends)
+# Default installation (HNSW backend, recommended)
 uv pip install leann
+
+# With DiskANN backend (for large-scale deployments)
+uv pip install leann[diskann]
 ```
 
 ## Quick Start
@@ -16,8 +19,8 @@ from leann import LeannBuilder, LeannSearcher, LeannChat
 from pathlib import Path
 INDEX_PATH = str(Path("./").resolve() / "demo.leann")
 
-# Build an index (choose backend: "hnsw" or "diskann")
-builder = LeannBuilder(backend_name="hnsw") # or "diskann" for large-scale deployments
+# Build an index
+builder = LeannBuilder(backend_name="hnsw")
 builder.add_text("LEANN saves 97% storage compared to traditional vector databases.")
 builder.add_text("Tung Tung Tung Sahur called—they need their banana‑crocodile hybrid back")
 builder.build_index(INDEX_PATH)
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "leann"
-version = "0.2.0"
+version = "0.1.16"
 description = "LEANN - The smallest vector index in the world. RAG Everything with LEANN!"
 readme = "README.md"
 requires-python = ">=3.9"
@@ -24,15 +24,16 @@ classifiers = [
     "Programming Language :: Python :: 3.12",
 ]
 
-# Default installation: core + hnsw + diskann
+# Default installation: core + hnsw
 dependencies = [
     "leann-core>=0.1.0",
     "leann-backend-hnsw>=0.1.0",
-    "leann-backend-diskann>=0.1.0",
 ]
 
 [project.optional-dependencies]
-# All backends now included by default
+diskann = [
+    "leann-backend-diskann>=0.1.0",
+]
 
 [project.urls]
 Repository = "https://github.com/yichuan-w/LEANN"
uv.lock (generated)
@@ -1650,7 +1650,7 @@ name = "importlib-metadata"
 version = "8.7.0"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
-    { name = "zipp", marker = "python_full_version < '3.10'" },
+    { name = "zipp" },
 ]
 sdist = { url = "https://files.pythonhosted.org/packages/76/66/650a33bd90f786193e4de4b3ad86ea60b53c89b669a5c7be931fac31cdb0/importlib_metadata-8.7.0.tar.gz", hash = "sha256:d13b81ad223b890aa16c5471f2ac3056cf76c5f10f82d6f9292f0b415f389000", size = 56641 }
 wheels = [
@@ -2155,7 +2155,7 @@ wheels = [
 
 [[package]]
 name = "leann-backend-diskann"
-version = "0.2.0"
+version = "0.1.15"
 source = { editable = "packages/leann-backend-diskann" }
 dependencies = [
     { name = "leann-core" },
@@ -2167,14 +2167,14 @@ dependencies = [
 
 [package.metadata]
 requires-dist = [
-    { name = "leann-core", specifier = "==0.2.0" },
+    { name = "leann-core", specifier = "==0.1.15" },
     { name = "numpy" },
     { name = "protobuf", specifier = ">=3.19.0" },
 ]
 
 [[package]]
 name = "leann-backend-hnsw"
-version = "0.2.0"
+version = "0.1.15"
 source = { editable = "packages/leann-backend-hnsw" }
 dependencies = [
     { name = "leann-core" },
|
 
 [package.metadata]
 requires-dist = [
-    { name = "leann-core", specifier = "==0.2.0" },
+    { name = "leann-core", specifier = "==0.1.15" },
     { name = "msgpack", specifier = ">=1.0.0" },
     { name = "numpy" },
     { name = "pyzmq", specifier = ">=23.0.0" },
@@ -2195,7 +2195,7 @@ requires-dist = [
 
 [[package]]
 name = "leann-core"
-version = "0.2.0"
+version = "0.1.15"
 source = { editable = "packages/leann-core" }
 dependencies = [
     { name = "accelerate" },