
Detailed Features

🔥 Core Features

  • 🔄 Real-time Embeddings - Eliminate heavy embedding storage by computing embeddings on the fly, using optimized ZMQ servers, an overlapped-and-batched search paradigm, and a high-throughput embedding engine (a toy sketch follows this list)
  • 📈 Scalable Architecture - Handles millions of documents on consumer hardware; the larger your dataset, the more storage LEANN saves
  • 🎯 Graph Pruning - Advanced pruning techniques that shrink the storage overhead of vector search to a small footprint
  • 🏗️ Pluggable Backends - HNSW/FAISS (default), with optional DiskANN for large-scale deployments
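
The storage-for-compute trade behind real-time embeddings is easy to see in miniature. The following is a toy sketch, not LEANN's implementation: only raw text and a small neighbor graph are "stored", and embeddings are recomputed for just the nodes a greedy traversal actually visits. The `embed` stand-in, the documents, and the graph are all invented for illustration.

```python
# Toy illustration of recompute-at-query-time (not LEANN's actual code).
# We persist only raw text and a neighbor graph; embeddings are derived
# on demand for the few nodes a greedy graph search touches.
import hashlib
import numpy as np

def embed(text: str, dim: int = 8) -> np.ndarray:
    """Deterministic stand-in for a real embedding model."""
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")
    v = np.random.default_rng(seed).standard_normal(dim)
    return v / np.linalg.norm(v)

# "Stored" index: raw text + adjacency only; no vectors on disk.
docs = {i: f"document number {i}" for i in range(6)}
graph = {0: [1, 2], 1: [0, 3], 2: [0, 4], 3: [1, 5], 4: [2, 5], 5: [3, 4]}

def search(query: str, entry: int = 0, max_hops: int = 4) -> int:
    q = embed(query)
    best = entry
    best_sim = float(q @ embed(docs[best]))  # recomputed, never loaded from disk
    visited = {best}
    for _ in range(max_hops):
        candidates = [nb for nb in graph[best] if nb not in visited]
        if not candidates:
            break
        visited.update(candidates)
        # Each candidate embedding exists only transiently, then is discarded.
        sims = {nb: float(q @ embed(docs[nb])) for nb in candidates}
        nb_best = max(sims, key=sims.get)
        if sims[nb_best] <= best_sim:
            break
        best, best_sim = nb_best, sims[nb_best]
    return best

print(search("document number 5"))
```

Because only visited nodes are ever embedded, and graph pruning keeps the visited set small, no embedding matrix needs to live on disk or in memory.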

🛠️ Technical Highlights

  • 🔄 Recompute Mode - Serves the highest-accuracy scenarios while eliminating vector storage overhead
  • Zero-copy Operations - Minimize IPC overhead by transferring distances instead of embeddings
  • 🚀 High-throughput Embedding Pipeline - Optimized batched processing for maximum efficiency (batching plus distances-only transfer is sketched after this list)
  • 🎯 Two-level Search - Novel coarse-to-fine search overlap for accelerated query processing (optional)
  • 💾 Memory-mapped Indices - Fast startup, with raw text memory-mapped to reduce memory overhead
  • 🚀 MLX Support - Ultra-fast recompute and index building with quantized embedding models, accelerating both build and search (minimal example)
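
Batched recomputation and distances-only transfer compose naturally, which is the essence of the pipeline and zero-copy bullets above. The sketch below is illustrative rather than LEANN's actual ZMQ pipeline: a server-side function embeds a whole candidate batch in one call and ships back only scalar distances, so embeddings never cross the process boundary. `embed_batch` and `distance_server` are invented names.

```python
# Toy sketch of batched recompute + distances-only transfer
# (illustrative; not the real ZMQ worker pipeline).
import hashlib
import numpy as np

def embed_batch(texts: list[str], dim: int = 8) -> np.ndarray:
    """Batch stand-in for an embedding model; one matrix per call."""
    rows = []
    for t in texts:
        seed = int.from_bytes(hashlib.sha256(t.encode()).digest()[:4], "big")
        v = np.random.default_rng(seed).standard_normal(dim)
        rows.append(v / np.linalg.norm(v))
    return np.stack(rows)

def distance_server(query: str, candidates: list[str]) -> list[float]:
    """Embeds the batch and returns floats only: the caller never
    receives (or stores) the candidate embeddings themselves."""
    q = embed_batch([query])[0]
    sims = embed_batch(candidates) @ q          # one batched matmul
    return (1.0 - sims).tolist()                # cosine distance

# The search side expands a whole frontier at once instead of one
# neighbor at a time, amortizing per-call model overhead per hop.
frontier = ["doc a", "doc b", "doc c", "doc d"]
dists = distance_server("find doc c", frontier)
print(min(zip(dists, frontier)))
```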

🎨 Developer Experience

  • Simple Python API - Get started in minutes
  • Extensible backend system - Easy to add new algorithms (a generic registry pattern is sketched below)
  • Comprehensive examples - From basic usage to production deployment
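
One common way to make a backend system extensible is a small protocol plus a registry keyed by backend name. The sketch below shows that generic pattern under invented names (`SearchBackend`, `BACKENDS`, `register`); it is not LEANN's actual plugin interface.

```python
# Generic backend-registry pattern (names are hypothetical, not LEANN's API).
from typing import Protocol
import numpy as np

class SearchBackend(Protocol):
    def build(self, vectors: np.ndarray) -> None: ...
    def search(self, query: np.ndarray, k: int) -> list[int]: ...

BACKENDS: dict[str, type] = {}

def register(name: str):
    """Class decorator that makes a backend constructible by name."""
    def deco(cls):
        BACKENDS[name] = cls
        return cls
    return deco

@register("flat")
class FlatBackend:
    """Brute-force baseline; an HNSW or DiskANN class would slot in the same way."""
    def build(self, vectors: np.ndarray) -> None:
        self.vectors = vectors
    def search(self, query: np.ndarray, k: int) -> list[int]:
        dists = np.linalg.norm(self.vectors - query, axis=1)
        return np.argsort(dists)[:k].tolist()

backend = BACKENDS["flat"]()
backend.build(np.random.default_rng(0).standard_normal((100, 8)))
print(backend.search(np.zeros(8), k=3))
```

Adding a new algorithm then means writing one class that satisfies the protocol and registering it; callers select backends by name without changing search code.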