* feat: Add Ollama embedding support for local embedding models
* docs: Add clear documentation for Ollama embedding usage
* fix: remove leann_ask
* docs: remove ollama embedding extra instructions
* simplify MCP interface for Claude Code
- Remove unnecessary search parameters: search_mode, recompute_embeddings, file_types, min_score
- Remove leann_clear tool (not needed for Claude Code workflow)
- Streamline search to only use: query, index_name, top_k, complexity
- Keep core tools: leann_index, leann_search, leann_status, leann_list
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* remove leann_index from MCP interface
Users should use the CLI command 'leann build' to create indexes first.
MCP now only provides search functionality:
- leann_search: search existing indexes
- leann_status: check index health
- leann_list: list available indexes
This separates index creation (CLI) from search (Claude Code).
* improve CLI with auto project name and .gitignore support
- Make index_name optional, auto-use current directory name
- Read .gitignore patterns and respect them during indexing
- Add _read_gitignore_patterns() to parse .gitignore files
- Add _should_exclude_file() for pattern matching
- Apply exclusion patterns to both PDF and general file processing
- Show helpful messages about gitignore usage
Now users can simply run: leann build
It will use the project directory name and respect .gitignore patterns.
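The gitignore handling above can be sketched roughly as follows. Function names mirror the commit's `_read_gitignore_patterns` / `_should_exclude_file`, but the bodies are illustrative: real `.gitignore` semantics (negation with `!`, anchored patterns, etc.) are more involved than this `fnmatch`-based sketch.

```python
import fnmatch
from pathlib import Path

def read_gitignore_patterns(root: Path) -> list:
    """Collect non-comment, non-empty patterns from .gitignore, if present."""
    gitignore = root / ".gitignore"
    if not gitignore.exists():
        return []
    patterns = []
    for line in gitignore.read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line.rstrip("/"))  # treat dir patterns like names
    return patterns

def should_exclude(path: str, patterns: list) -> bool:
    """Exclude a file if any pattern matches its name or a path component."""
    parts = Path(path).parts
    for pattern in patterns:
        if fnmatch.fnmatch(Path(path).name, pattern):
            return True
        if any(fnmatch.fnmatch(part, pattern) for part in parts):
            return True
    return False
```

Applying `should_exclude` to every candidate file during both PDF and general processing is then enough to keep `node_modules/`, build artifacts, and similar directories out of the index.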
1. CI Logging Enhancements:
- Added comprehensive diagnostics with process tree, network listeners, file descriptors
- Added timestamps at every stage (before/during/after pytest)
- Added trap EXIT to always show diagnostics
- Added immediate process checks after pytest finishes
- Added sub-shell execution with immediate cleanup
2. Fixed Subprocess PIPE Blocking:
- Changed Colab mode from PIPE to DEVNULL to prevent blocking
- A PIPE that is never read eventually fills its buffer and can cause the parent process to wait indefinitely
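A minimal sketch of the PIPE-vs-DEVNULL difference (the child command and the CI check are illustrative, not LEANN's actual server launch):

```python
import os
import subprocess
import sys

# With stdout=PIPE and nobody draining the pipe, a chatty child blocks once
# the OS pipe buffer fills, and the parent can then wait on it forever.
# Sending output to DEVNULL avoids the deadlock when logs aren't needed.
quiet = os.environ.get("CI") == "true"  # illustrative condition
sink = subprocess.DEVNULL if quiet else None  # None inherits parent's streams

proc = subprocess.Popen(
    [sys.executable, "-c", "print('server running')"],
    stdout=sink,
    stderr=sink,
)
proc.wait(timeout=10)
```

If the output is actually needed, the alternative is to keep `PIPE` but drain it continuously (e.g. with `communicate()` or a reader thread) rather than leaving it unread.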
3. Pytest Session Hooks:
- Added pytest_sessionstart to log initial state
- Added pytest_sessionfinish for aggressive cleanup before exit
- Shows all child processes and their status
This should reveal exactly where the hang is happening.
* feat: Enhance Ollama embedding with better error handling and concurrent processing
- Add intelligent model validation and suggestions (inspired by OllamaChat)
- Implement concurrent processing for better performance
- Add retry mechanism with timeout handling
- Provide user-friendly error messages with emojis
- Auto-detect and recommend embedding models
- Add text truncation for long texts
- Improve progress bar display logic
* docs: don't mention Ollama embedding in the README
Based on excellent diagnostic suggestions, implemented multiple fixes:
1. Diagnostics:
- Added faulthandler to dump stack traces 10s before CI timeout
- Enhanced CI script with trap handler to show processes/network on timeout
- Added diag() function to capture pstree, processes, network listeners
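The faulthandler part of this can be sketched with the standard library directly; the 180s figure below is the CI timeout assumed by these commits:

```python
import faulthandler
import sys

# Schedule a dump of every thread's stack shortly before the CI-level
# timeout would kill the job, so the hang site appears in the logs
# instead of a silent kill.
CI_TIMEOUT_S = 180  # assumed CI timeout
faulthandler.dump_traceback_later(CI_TIMEOUT_S - 10, exit=False, file=sys.stderr)

# ... run the test session ...

# Cancel the pending dump once the run finishes normally.
faulthandler.cancel_dump_traceback_later()
```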
2. ZMQ Socket Timeouts (critical fix):
- Added RCVTIMEO=1000ms and SNDTIMEO=1000ms to all client sockets
- Added IMMEDIATE=1 to avoid connection blocking
- Reduced searcher timeout from 30s to 5s
- This prevents infinite blocking on recv/send operations
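The client-side socket options described above look roughly like this with pyzmq (the endpoint is hypothetical; without a live server, the send times out and raises `zmq.Again` instead of hanging):

```python
import zmq  # pyzmq, assumed available as in the LEANN environment

ctx = zmq.Context()
sock = ctx.socket(zmq.REQ)
sock.setsockopt(zmq.RCVTIMEO, 1000)   # recv() raises zmq.Again after 1000 ms
sock.setsockopt(zmq.SNDTIMEO, 1000)   # send() raises zmq.Again after 1000 ms
sock.setsockopt(zmq.IMMEDIATE, 1)     # queue only on completed connections
sock.setsockopt(zmq.LINGER, 0)        # drop pending messages on close
sock.connect("tcp://127.0.0.1:45999") # hypothetical (here: dead) endpoint

try:
    sock.send(b"ping")
    reply = sock.recv()
except zmq.Again:
    # Timed out: surface the failure instead of blocking forever.
    reply = None
finally:
    sock.close()
    ctx.term()
```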
3. Context.instance() Fix (major issue):
- NEVER call term() or destroy() on Context.instance()
- This was causing blocking as it waits for ALL sockets to close
- Now only set linger=0 without terminating
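In code, the fix amounts to touching only the context's default options, never its lifetime (sketch, assuming pyzmq):

```python
import zmq

# Never call term()/destroy() on the shared zmq.Context.instance():
# term() blocks until *every* socket created from it is closed, which is
# exactly the shutdown hang described above. Setting linger=0 instead
# makes sockets drop unsent messages when they eventually close.
ctx = zmq.Context.instance()
ctx.setsockopt(zmq.LINGER, 0)  # default for sockets created after this point
```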
4. Enhanced Process Cleanup:
- Added _reap_children fixture for aggressive session-end cleanup
- Better recursive child process termination
- Added final wait to ensure cleanup completes
The 180s timeout was happening because:
- ZMQ recv() was blocking indefinitely without timeout
- Context.instance().term() was waiting for all sockets
- Child processes weren't being fully cleaned up
These changes should prevent the hanging completely.
Fixed the actual root cause instead of just masking it in tests:
1. Root Problem:
- The C++ side's ZmqDistanceComputer creates ZMQ connections but doesn't clean them up
- Python 3.9/3.13 are more sensitive to cleanup timing during shutdown
2. Core Fixes in SearcherBase and LeannSearcher:
- Added cleanup() method to BaseSearcher that cleans ZMQ and embedding server
- LeannSearcher.cleanup() now also handles ZMQ context cleanup
- Both HNSW and DiskANN searchers now properly delete C++ index objects
3. Backend-Specific Cleanup:
- HNSWSearcher.cleanup(): Deletes self.index to trigger C++ destructors
- DiskannSearcher.cleanup(): Deletes self._index and resets state
- Both force garbage collection after deletion
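The backend cleanup pattern amounts to the following sketch (the class and attribute names are illustrative, following the commit): dropping the last Python reference to the wrapped C++ index triggers its destructor, which tears down the ZMQ connections held on the C++ side, and a forced `gc.collect()` makes that happen deterministically rather than during interpreter shutdown.

```python
import gc

class HNSWSearcher:
    def __init__(self, index):
        self.index = index  # wrapped C++ index object (illustrative)

    def cleanup(self):
        if getattr(self, "index", None) is not None:
            del self.index  # drop the reference -> C++ destructor runs
            gc.collect()    # collect now, not at interpreter shutdown
```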
4. Test Infrastructure:
- Added auto_cleanup_searcher fixture for explicit resource management
- Global cleanup now more aggressive with ZMQ context destruction
This is the proper fix - cleaning up resources at the source, not just
working around the issue in tests. The hanging was caused by C++ side
ZMQ connections not being properly terminated when is_recompute=True.
Based on an excellent analysis from a user, implemented comprehensive fixes:
1. ZMQ Socket Cleanup:
- Set LINGER=0 on all ZMQ sockets (client and server)
- Use try-finally blocks to ensure socket.close() and context.term()
- Prevents blocking on exit when ZMQ contexts have pending operations
2. Global Test Cleanup:
- Added tests/conftest.py with session-scoped cleanup fixture
- Cleans up leftover ZMQ contexts and child processes after all tests
- Lists remaining threads for debugging
3. CI Improvements:
- Apply timeout to ALL Python versions on Linux (not just 3.13)
- Increased timeout to 180s for better reliability
- Added process cleanup (pkill) on timeout
4. Dependencies:
- Added psutil>=5.9.0 to test dependencies for process management
Root cause: Python 3.9/3.13 are more sensitive to cleanup timing during
interpreter shutdown. ZMQ's default LINGER=-1 was blocking exit, and
atexit handlers were unreliable for cleanup.
This should resolve the 'all tests pass but CI hangs' issue.
- Improve grammar and sentence structure in MCP section
- Add proper markdown image formatting with relative paths
- Optimize mcp_leann.png size (1.3MB -> 224KB)
- Update data description to be more specific about Chinese content
- Add flush=True to all print statements in convert_to_csr.py to prevent buffer deadlock
- Redirect embedding server stdout/stderr to DEVNULL in CI environment (CI=true)
- Fix timeout in embedding_server_manager.stop_server() final wait call
- Replace 'int | None' with 'Optional[int]' everywhere
- Replace 'subprocess.Popen | None' with 'Optional[subprocess.Popen]'
- Add Optional import to all affected files
- Update ruff target-version from py310 to py39
- The '|' syntax for Union types was introduced in Python 3.10 (PEP 604)
Fixes TypeError: unsupported operand type(s) for |: 'type' and 'NoneType'
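The compatible form looks like this (the helper is a made-up illustration; only the annotation style is the point). PEP 604's `X | None` syntax is evaluated when the `def` statement runs, so on Python 3.9 it raises the `TypeError` above, while `typing.Optional` works on every supported version:

```python
from typing import Optional
import subprocess

def find_port(hint: Optional[int] = None) -> int:
    """Illustrative helper: fall back to a default when no hint is given."""
    return hint if hint is not None else 8000

# Variable annotations get the same treatment:
proc: Optional[subprocess.Popen] = None  # annotation only; no process started
```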
- Add logging in DiskANN embedding server to show metadata_file_path
- Add debug logging in PassageManager to trace path resolution
- This will help identify why CI fails to find passage files
- Add GraphPartitioner class for advanced graph partitioning
- Add partition_graph_simple function for easy-to-use partitioning
- Add pybind11 dependency for C++ executable building
- Update __init__.py to export partition functions
- Include test scripts for partition functionality
The partition functionality allows optimizing disk-based indices
for better search performance and memory efficiency.
* docs: config guidance
* feat: add comprehensive configuration guide and update README
- Create docs/configuration-guide.md with detailed guidance on:
  - Embedding model selection (small/medium/large)
  - Index selection (HNSW vs DiskANN)
  - LLM engine and model comparison
  - Parameter tuning (build/search complexity, top-k)
  - Performance optimization tips
  - Deep dive into LEANN's recomputation feature
- Update README.md to link to the configuration guide
- Include latest 2025 model recommendations (Qwen3, DeepSeek-R1, O3-mini)
* chore: move evaluation data .gitattributes to correct location
* docs: Weaken DiskANN emphasis in README
- Change backend description to emphasize HNSW as default
- DiskANN positioned as optional for billion-scale datasets
- Simplify evaluation commands to be more generic
* docs: Adjust DiskANN positioning in features and roadmap
- features.md: Put HNSW/FAISS first as default, DiskANN as optional
- roadmap.md: Reorder to show HNSW integration before DiskANN
- Consistent with positioning DiskANN as advanced option for large-scale use
* docs: Improve configuration guide based on feedback
- List specific files in default data/ directory (2 AI papers, literature, tech report)
- Update examples to use English and better RAG-suitable queries
- Change full dataset reference to use --max-items -1
- Adjust small model guidance about upgrading to larger models when time allows
- Update top-k defaults to reflect actual default of 20
- Ensure consistent use of full model name Qwen/Qwen3-Embedding-0.6B
- Reorder optimization steps, move MLX to third position
- Remove incorrect chunk size tuning guidance
- Change README from 'Having trouble' to 'Need best practices'
* docs: Address all configuration guide feedback
- Fix grammar: 'If time is not a constraint' instead of 'time expense is not large'
- Highlight Qwen3-Embedding-0.6B performance (nearly OpenAI API level)
- Add OpenAI quick start section with configuration example
- Fold Cloud vs Local trade-offs into collapsible section
- Update HNSW as 'default and recommended for extreme low storage'
- Add DiskANN beta warning and explain PQ+rerank architecture
- Expand Ollama models: add qwen3:0.6b, 4b, 7b variants
- Note OpenAI as current default but recommend Ollama switch
- Add 'need to install extra software' warning for Ollama
- Remove incorrect latency numbers from search-complexity recommendations
* docs: add a link
- Add intelligent memory calculation based on data size and system specs
- search_memory_maximum: 1/10 of embedding size (controls PQ compression)
- build_memory_maximum: 50% of available RAM (controls sharding)
- Provides optimal balance between performance and memory usage
- Automatic fallback to default values if parameters are explicitly provided
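The heuristic can be sketched as below. Parameter names follow the commit; the float32 assumption and GB units are illustrative, and in practice the available-RAM figure would come from something like psutil rather than being passed in:

```python
def auto_memory_budgets(num_vectors, dim, available_ram_bytes, dtype_bytes=4):
    """Derive DiskANN-style memory caps from data size and system RAM.

    search_memory_maximum: 1/10 of the raw embedding size, bounding how
    aggressively product quantization must compress the vectors.
    build_memory_maximum: 50% of available RAM, controlling how many
    shards the build phase splits the index into. Both returned in GB.
    """
    embedding_bytes = num_vectors * dim * dtype_bytes  # float32 assumed
    gb = 1024 ** 3
    search_memory_maximum = (embedding_bytes / 10) / gb
    build_memory_maximum = (available_ram_bytes * 0.5) / gb
    return search_memory_maximum, build_memory_maximum

# Example: 1M 768-d float32 vectors (~3 GB raw) with 16 GB of free RAM
search_gb, build_gb = auto_memory_budgets(1_000_000, 768, 16 * 1024 ** 3)
```

If the caller passes `search_memory_maximum` or `build_memory_maximum` explicitly, the computed values are simply ignored, which matches the fallback behaviour described above.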