Commit Graph

183 Commits

Author SHA1 Message Date
Andy Lee
d9e5d5d6aa Merge branch 'main' into feature/graph-partition-support 2025-08-11 01:46:31 -07:00
GitHub Actions
239e35e2e6 chore: release v0.2.7 2025-08-11 03:11:46 +00:00
Andy Lee
2fac0c6fbf fix: improve gitignore and Jupyter notebook support (#28)
- Add nbconvert dependency for .ipynb file support
- Replace manual gitignore parsing with gitignore-parser library
- Proper recursive .gitignore handling (all subdirectories)
- Fix compliance with Git gitignore behavior
- Simplify code and improve reliability

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Claude <noreply@anthropic.com>
2025-08-10 20:02:46 -07:00
GitHub Actions
5e97916608 chore: release v0.2.6 2025-08-10 03:39:45 +00:00
Andy Lee
8b9c2be8c9 Feat/claude code refine (#24)
* feat: Add Ollama embedding support for local embedding models

* docs: Add clear documentation for Ollama embedding usage

* fix: remove leann_ask

* docs: remove ollama embedding extra instructions

* simplify MCP interface for Claude Code

- Remove unnecessary search parameters: search_mode, recompute_embeddings, file_types, min_score
- Remove leann_clear tool (not needed for Claude Code workflow)
- Streamline search to only use: query, index_name, top_k, complexity
- Keep core tools: leann_index, leann_search, leann_status, leann_list

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* remove leann_index from MCP interface

Users should use CLI command 'leann build' to create indexes first.
MCP now only provides search functionality:
- leann_search: search existing indexes
- leann_status: check index health
- leann_list: list available indexes

This separates index creation (CLI) from search (Claude Code).

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* improve CLI with auto project name and .gitignore support

- Make index_name optional, auto-use current directory name
- Read .gitignore patterns and respect them during indexing
- Add _read_gitignore_patterns() to parse .gitignore files
- Add _should_exclude_file() for pattern matching
- Apply exclusion patterns to both PDF and general file processing
- Show helpful messages about gitignore usage

Now users can simply run: leann build
And it will use project name + respect .gitignore patterns.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-08-09 20:37:17 -07:00
Andy Lee
439debbd3f fix: add extensive logging and fix subprocess PIPE blocking
1. CI Logging Enhancements:
   - Added comprehensive diagnostics with process tree, network listeners, file descriptors
   - Added timestamps at every stage (before/during/after pytest)
   - Added trap EXIT to always show diagnostics
   - Added immediate process checks after pytest finishes
   - Added sub-shell execution with immediate cleanup

2. Fixed Subprocess PIPE Blocking:
   - Changed Colab mode from PIPE to DEVNULL to prevent blocking
   - PIPE without reading can cause parent process to wait indefinitely

3. Pytest Session Hooks:
   - Added pytest_sessionstart to log initial state
   - Added pytest_sessionfinish for aggressive cleanup before exit
   - Shows all child processes and their status

This should reveal exactly where the hang is happening.
2025-08-08 18:55:50 -07:00
Andy Lee
3ff5aac8e0 Add Ollama embedding support to enable local embedding models (#22)
* feat: Add Ollama embedding support for local embedding models

* docs: Add clear documentation for Ollama embedding usage

* feat: Enhance Ollama embedding with better error handling and concurrent processing

- Add intelligent model validation and suggestions (inspired by OllamaChat)
- Implement concurrent processing for better performance
- Add retry mechanism with timeout handling
- Provide user-friendly error messages with emojis
- Auto-detect and recommend embedding models
- Add text truncation for long texts
- Improve progress bar display logic

* docs: don't mention it in README
2025-08-08 18:44:07 -07:00
Andy Lee
a35bfb0354 fix: comprehensive ZMQ timeout and cleanup fixes based on detailed analysis
Based on excellent diagnostic suggestions, implemented multiple fixes:

1. Diagnostics:
   - Added faulthandler to dump stack traces 10s before CI timeout
   - Enhanced CI script with trap handler to show processes/network on timeout
   - Added diag() function to capture pstree, processes, network listeners

2. ZMQ Socket Timeouts (critical fix):
   - Added RCVTIMEO=1000ms and SNDTIMEO=1000ms to all client sockets
   - Added IMMEDIATE=1 to avoid connection blocking
   - Reduced searcher timeout from 30s to 5s
   - This prevents infinite blocking on recv/send operations

3. Context.instance() Fix (major issue):
   - NEVER call term() or destroy() on Context.instance()
   - This was causing blocking as it waits for ALL sockets to close
   - Now only set linger=0 without terminating

4. Enhanced Process Cleanup:
   - Added _reap_children fixture for aggressive session-end cleanup
   - Better recursive child process termination
   - Added final wait to ensure cleanup completes

The 180s timeout was happening because:
- ZMQ recv() was blocking indefinitely without timeout
- Context.instance().term() was waiting for all sockets
- Child processes weren't being fully cleaned up

These changes should prevent the hanging completely.
2025-08-08 18:29:09 -07:00
Andy Lee
a6dad47280 fix: address root cause of test hanging - improper ZMQ/C++ resource cleanup
Fixed the actual root cause instead of just masking it in tests:

1. Root Problem:
   - C++ side's ZmqDistanceComputer creates ZMQ connections but doesn't clean them
   - Python 3.9/3.13 are more sensitive to cleanup timing during shutdown

2. Core Fixes in SearcherBase and LeannSearcher:
   - Added cleanup() method to BaseSearcher that cleans ZMQ and embedding server
   - LeannSearcher.cleanup() now also handles ZMQ context cleanup
   - Both HNSW and DiskANN searchers now properly delete C++ index objects

3. Backend-Specific Cleanup:
   - HNSWSearcher.cleanup(): Deletes self.index to trigger C++ destructors
   - DiskannSearcher.cleanup(): Deletes self._index and resets state
   - Both force garbage collection after deletion

4. Test Infrastructure:
   - Added auto_cleanup_searcher fixture for explicit resource management
   - Global cleanup now more aggressive with ZMQ context destruction

This is the proper fix - cleaning up resources at the source, not just
working around the issue in tests. The hanging was caused by C++ side
ZMQ connections not being properly terminated when is_recompute=True.
2025-08-08 17:54:03 -07:00
yichuan520030910320
67fef60466 [Readme]More about claude code 2025-08-08 16:05:35 -07:00
Andy Lee
131f10b286 Merge branch 'main' into feature/graph-partition-support 2025-08-08 16:02:54 -07:00
Andy Lee
e3762458fc fix: prevent test runner hanging on Python 3.9/3.13 due to ZMQ and process cleanup issues
Based on excellent analysis from user, implemented comprehensive fixes:

1. ZMQ Socket Cleanup:
   - Set LINGER=0 on all ZMQ sockets (client and server)
   - Use try-finally blocks to ensure socket.close() and context.term()
   - Prevents blocking on exit when ZMQ contexts have pending operations

2. Global Test Cleanup:
   - Added tests/conftest.py with session-scoped cleanup fixture
   - Cleans up leftover ZMQ contexts and child processes after all tests
   - Lists remaining threads for debugging

3. CI Improvements:
   - Apply timeout to ALL Python versions on Linux (not just 3.13)
   - Increased timeout to 180s for better reliability
   - Added process cleanup (pkill) on timeout

4. Dependencies:
   - Added psutil>=5.9.0 to test dependencies for process management

Root cause: Python 3.9/3.13 are more sensitive to cleanup timing during
interpreter shutdown. ZMQ's default LINGER=-1 was blocking exit, and
atexit handlers were unreliable for cleanup.

This should resolve the 'all tests pass but CI hangs' issue.
2025-08-08 15:57:22 -07:00
GitHub Actions
b6ab6f1993 chore: release v0.2.5 2025-08-08 22:32:27 +00:00
joshuashaffer
9f2e82a838 Propagate hosts argument for ollama through chat.py (#21)
* Propigate hosts argument for ollama through chat.py

* Apply suggestions from code review

Good AI slop suggestions.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-08-08 15:31:15 -07:00
yichuan520030910320
0f790fbbd9 docs: polish README and add optimized MCP integration image
- Improve grammar and sentence structure in MCP section
- Add proper markdown image formatting with relative paths
- Optimize mcp_leann.png size (1.3MB -> 224KB)
- Update data description to be more specific about Chinese content
2025-08-08 00:58:36 -07:00
GitHub Actions
387ae21eba chore: release v0.2.4 2025-08-08 07:14:51 +00:00
Andy Lee
3cc329c3e7 fix: remove hardcoded paths from MCP server and documentation 2025-08-08 00:08:56 -07:00
Andy Lee
a8421c0475 Merge branch 'main' into feature/graph-partition-support 2025-08-07 23:57:28 -07:00
Andy Lee
777b5fed01 fix: remove hardcoded paths from MCP server and documentation 2025-08-07 23:56:01 -07:00
Andy Lee
440ad6e816 fix: resolve CI hanging by removing problematic wait() in stop_server 2025-08-07 23:55:56 -07:00
Andy Lee
5567302316 feat: promote Claude Code integration as primary RAG feature 2025-08-07 23:19:19 -07:00
Andy Lee
8714472cd8 fix: prevent hang in CI by flushing print statements and redirecting embedding server output
- Add flush=True to all print statements in convert_to_csr.py to prevent buffer deadlock
- Redirect embedding server stdout/stderr to DEVNULL in CI environment (CI=true)
- Fix timeout in embedding_server_manager.stop_server() final wait call
2025-08-07 21:53:58 -07:00
GitHub Actions
075d4bd167 chore: release v0.2.2 2025-08-08 01:58:40 +00:00
yichuan520030910320
e4bcc76f88 fix cli & make recompute default true 2025-08-07 18:58:04 -07:00
yichuan520030910320
710e83b1fd fix cli if there is no other type of doc to make it robust 2025-08-07 18:46:05 -07:00
Andy Lee
c799d61a5a fix: add timeout to final wait() in stop_server to prevent infinite hang 2025-08-07 18:40:57 -07:00
yichuan520030910320
c96d653072 more support for type of docs in cli 2025-08-07 18:14:03 -07:00
Andy Lee
e409933149 chore: keep embedding server stdout/stderr visible; still use new session and pg-kill on stop 2025-08-07 17:55:42 -07:00
Andy Lee
bc31876a9f style: organize imports; fix process-group stop for embedding server 2025-08-07 17:54:26 -07:00
Andy Lee
e421c44b8b fix(py39): remove zip(strict=...) usage in api; Python 3.9 compatibility 2025-08-07 15:50:07 -07:00
Andy Lee
af69aa0508 fix(py39): replace remaining '| None' in diskann graph_partition (module-level function) 2025-08-07 15:28:29 -07:00
Andy Lee
575b354976 style: organize imports per ruff; finish py39 Optional changes
- Fix import ordering in embedding servers and graph_partition_simple
- Remove duplicate Optional import
- Complete Optional[...] replacements
2025-08-07 15:06:25 -07:00
Andy Lee
65bbff1d93 fix(py39): replace union type syntax in chat.py
- validate_model_and_suggest: str | None -> Optional[str]
- OpenAIChat.__init__: api_key: str | None -> Optional[str]
- get_llm: dict[str, Any] | None -> Optional[dict[str, Any]]

Ensures Python 3.9 compatibility for CI macOS 3.9.
2025-08-07 15:01:09 -07:00
Andy Lee
677eb0bae3 fix: Python 3.9 compatibility - replace Union type syntax
- Replace 'int | None' with 'Optional[int]' everywhere
- Replace 'subprocess.Popen | None' with 'Optional[subprocess.Popen]'
- Add Optional import to all affected files
- Update ruff target-version from py310 to py39
- The '|' syntax for Union types was introduced in Python 3.10 (PEP 604)

Fixes TypeError: unsupported operand type(s) for |: 'type' and 'NoneType'
2025-08-07 12:54:16 -07:00
Andy Lee
45bdad4fa7 debug: add detailed logging for CI path resolution debugging
- Add logging in DiskANN embedding server to show metadata_file_path
- Add debug logging in PassageManager to trace path resolution
- This will help identify why CI fails to find passage files
2025-08-07 00:00:12 -07:00
Andy Lee
d217adbe40 fix: diskann building and partitioning 2025-08-06 21:32:03 -07:00
Andy Lee
b8da9d7b12 docs: tool cli install 2025-08-06 21:28:05 -07:00
Andy Lee
0cb0463929 fix: always use relative path in metadata 2025-08-06 21:27:43 -07:00
yichuan520030910320
b982241249 add a path related fix 2025-08-05 23:35:48 -07:00
yichuan520030910320
c66f197e1d ruff 2025-08-05 23:24:55 -07:00
yichuan520030910320
4a1353761a merge 2025-08-05 23:23:07 -07:00
yichuan520030910320
a72090d2ab merge 2025-08-05 23:22:48 -07:00
yichuan520030910320
669e622430 chore: Update DiskANN submodule to latest with graph partition tools
- Update DiskANN submodule to commit b2dc4ea
- Includes graph partition tools and CMake integration
- Enables graph partitioning functionality in DiskANN backend
2025-08-05 23:14:19 -07:00
yichuan520030910320
77d7b60a61 feat: Add graph partition support for DiskANN backend
- Add GraphPartitioner class for advanced graph partitioning
- Add partition_graph_simple function for easy-to-use partitioning
- Add pybind11 dependency for C++ executable building
- Update __init__.py to export partition functions
- Include test scripts for partition functionality

The partition functionality allows optimizing disk-based indices
for better search performance and memory efficiency.
2025-08-05 23:11:09 -07:00
Andy Lee
8b22d2b5d3 Merge pull request #19 from yichuan-w/feature/claude-code-research
Feature/claude code research
2025-08-05 23:02:34 -07:00
yichuan520030910320
f94ce63d51 add gpt oss! serve your RAG using ollama 2025-08-05 16:49:52 -07:00
GitHub Actions
4271ff9d84 chore: release v0.2.1 2025-08-05 05:50:56 +00:00
Andy Lee
0d448c4a41 docs: config guidance (#17)
* docs: config guidance

* feat: add comprehensive configuration guide and update README

- Create docs/configuration-guide.md with detailed guidance on:
  - Embedding model selection (small/medium/large)
  - Index selection (HNSW vs DiskANN)
  - LLM engine and model comparison
  - Parameter tuning (build/search complexity, top-k)
  - Performance optimization tips
  - Deep dive into LEANN's recomputation feature
- Update README.md to link to the configuration guide
- Include latest 2025 model recommendations (Qwen3, DeepSeek-R1, O3-mini)

* chore: move evaluation data .gitattributes to correct location

* docs: Weaken DiskANN emphasis in README

- Change backend description to emphasize HNSW as default
- DiskANN positioned as optional for billion-scale datasets
- Simplify evaluation commands to be more generic

* docs: Adjust DiskANN positioning in features and roadmap

- features.md: Put HNSW/FAISS first as default, DiskANN as optional
- roadmap.md: Reorder to show HNSW integration before DiskANN
- Consistent with positioning DiskANN as advanced option for large-scale use

* docs: Improve configuration guide based on feedback

- List specific files in default data/ directory (2 AI papers, literature, tech report)
- Update examples to use English and better RAG-suitable queries
- Change full dataset reference to use --max-items -1
- Adjust small model guidance about upgrading to larger models when time allows
- Update top-k defaults to reflect actual default of 20
- Ensure consistent use of full model name Qwen/Qwen3-Embedding-0.6B
- Reorder optimization steps, move MLX to third position
- Remove incorrect chunk size tuning guidance
- Change README from 'Having trouble' to 'Need best practices'

* docs: Address all configuration guide feedback

- Fix grammar: 'If time is not a constraint' instead of 'time expense is not large'
- Highlight Qwen3-Embedding-0.6B performance (nearly OpenAI API level)
- Add OpenAI quick start section with configuration example
- Fold Cloud vs Local trade-offs into collapsible section
- Update HNSW as 'default and recommended for extreme low storage'
- Add DiskANN beta warning and explain PQ+rerank architecture
- Expand Ollama models: add qwen3:0.6b, 4b, 7b variants
- Note OpenAI as current default but recommend Ollama switch
- Add 'need to install extra software' warning for Ollama
- Remove incorrect latency numbers from search-complexity recommendations

* docs: add a link
2025-08-04 22:50:32 -07:00
yichuan520030910320
efdf6d917a fix diskann for faster mode 2025-08-04 17:46:46 -07:00
Andy Lee
dd71ac8d71 feat: implement smart memory configuration for DiskANN (#16)
- Add intelligent memory calculation based on data size and system specs
- search_memory_maximum: 1/10 of embedding size (controls PQ compression)
- build_memory_maximum: 50% of available RAM (controls sharding)
- Provides optimal balance between performance and memory usage
- Automatic fallback to default values if parameters are explicitly provided
2025-08-04 14:36:29 -07:00