- Add Python faulthandler integration with signal-triggered stack dumps
- Implement periodic stack dumps at 5min and 10min intervals
- Add external process monitoring with SIGUSR1 signal on hang detection
- Use debug_pytest.py wrapper to capture exact hang location in C++ cleanup
- Enhance CPU stability monitoring to trigger precise stack traces
This addresses the persistent pytest hanging issue in Ubuntu 22.04 CI by
providing detailed stack traces to identify the exact code location where
the hang occurs during test cleanup phase.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Fix Python code formatting in YAML (pre-commit fixed indentation issues)
- Add comprehensive post-pytest cleanup monitoring
- Monitor for hanging processes after test completion
- Focus on teardown phase based on previous hang analysis
This addresses the root cause identified: hang occurs after tests pass,
likely during cleanup/teardown of C++ extensions or embedding servers.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Remove debug_enabled input parameter that no longer exists in build-reusable.yml
- Keep workflow_dispatch trigger but without debug options
- Fixes workflow validation error: 'debug_enabled is not defined'
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Increase pytest timeout from 300s to 600s for thorough testing
- Increase import testing timeout from 60s to 120s
- Allow more time for C++ extension loading (faiss/diskann)
- Still provides timeout protection against infinite hangs
This gives the system more time to complete imports and tests
while still catching genuine hangs that exceed reasonable limits.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Remove all upterm/tmate SSH debugging infrastructure
- Restore clean CI workflow from main branch
- Remove diagnostic script that was only for SSH debugging
- Keep valuable DiskANN and HNSW backend improvements
This provides a clean base to add targeted pytest hang debugging
without the complexity of SSH sessions.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
The debug branch had updated DiskANN submodule to a version with
hardcoded OpenMP paths that break macOS 13 builds. This reverts
to the stable version used in main branch.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Add proper wait and retry logic for tmate initialization
- Tmate needs time to connect to servers before showing SSH info
- Try multiple times with delays to get connection details
* ci: add Mac Intel (x86_64) build support
* fix: auto-detect Homebrew path for Intel vs Apple Silicon Macs
This fixes the hardcoded /opt/homebrew path which only works on Apple
Silicon Macs. Intel Macs use /usr/local as the Homebrew prefix.
* fix: auto-detect Homebrew paths for both DiskANN and HNSW backends
- Fix DiskANN CMakeLists.txt path reference
- Add macOS environment variable detection for OpenMP_ROOT
- Support both Intel (/usr/local) and Apple Silicon (/opt/homebrew) paths
* fix: improve macOS build reliability with proper OpenMP path detection
- Add proper CMAKE_PREFIX_PATH and OpenMP_ROOT detection for both Intel and Apple Silicon Macs
- Set LDFLAGS and CPPFLAGS for all Homebrew packages to ensure CMake can find them
- Apply CMAKE_ARGS to both HNSW and DiskANN backends for consistent builds
- Fix hardcoded paths that caused build failures on Intel Macs (macos-13)
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: add abseil library path for protobuf compilation on macOS
- Include abseil in CMAKE_PREFIX_PATH for both Intel and Apple Silicon Macs
- Add explicit absl_DIR CMake variable to help find abseil for protobuf
- Fixes 'absl/log/absl_log.h' file not found error during compilation
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: add abseil include path to CPPFLAGS for both Intel and Apple Silicon
- Add -I/opt/homebrew/opt/abseil/include to CPPFLAGS for Apple Silicon
- Add -I/usr/local/opt/abseil/include to CPPFLAGS for Intel
- Fixes 'absl/log/absl_log.h' file not found by ensuring abseil headers are in compiler include path
Root cause: CMAKE_PREFIX_PATH alone wasn't sufficient - compiler needs explicit -I flags
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: clean build system and Python 3.9 compatibility
Build system improvements:
- Simplify macOS environment detection using brew --prefix
- Remove complex hardcoded paths and CMAKE_ARGS
- Let CMake automatically find Homebrew packages via CMAKE_PREFIX_PATH
- Clean separation between Intel (/usr/local) and Apple Silicon (/opt/homebrew)
Python 3.9 compatibility:
- Set ruff target-version to py39 to match project requirements
- Replace str | None with Union[str, None] in type annotations
- Add Union imports where needed
- Fix core interface, CLI, chat, and embedding server files
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: type
* fix: ensure CMAKE_PREFIX_PATH is passed to backend builds
- Add CMAKE_ARGS with CMAKE_PREFIX_PATH and OpenMP_ROOT for both HNSW and DiskANN backends
- This ensures CMake can find Homebrew packages on both Intel (/usr/local) and Apple Silicon (/opt/homebrew)
- Fixes the issue where CMake was still looking for hardcoded paths instead of using detected ones
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: configure CMake paths in pyproject.toml for proper Homebrew detection
- Add CMAKE_PREFIX_PATH and OpenMP_ROOT environment variable mapping in both backends
- Remove CMAKE_ARGS from GitHub Actions workflow (cleaner separation)
- Ensure scikit-build-core correctly uses environment variables for CMake configuration
- This should fix the hardcoded /opt/homebrew paths on Intel Macs
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: remove hardcoded /opt/homebrew paths from DiskANN CMake
- Auto-detect Homebrew libomp path using OpenMP_ROOT environment variable
- Fallback to CMAKE_PREFIX_PATH/opt/libomp if OpenMP_ROOT not set
- Final fallback to brew --prefix libomp for auto-detection
- Maintains backwards compatibility with old hardcoded path
- Fixes Intel Mac builds that were failing due to hardcoded Apple Silicon paths
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: update DiskANN submodule with macOS Intel/Apple Silicon compatibility fixes
- Auto-detect Homebrew libomp path using OpenMP_ROOT environment variable
- Exclude mkl_set_num_threads on macOS (uses Accelerate framework instead of MKL)
- Fixes compilation on Intel Macs by using correct /usr/local paths
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: update DiskANN submodule with SIMD function name corrections
- Fix _mm128_loadu_ps to _mm_loadu_ps (and similar functions)
- This is a known issue in upstream DiskANN code where incorrect function names were used
- Resolves compilation errors on macOS Intel builds
References: Known DiskANN issue with SIMD intrinsics naming
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: update DiskANN submodule with type cast fix for signed char templates
- Add missing type casts (float*)a and (float*)b in SSE2 version
- This matches the existing type casts in the AVX version
- Fixes compilation error when instantiating DistanceInnerProduct<int8_t>
- Resolves "cannot initialize const float* with const signed char*" error
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: update Faiss submodule with override keyword fix
- Add missing override keyword to IDSelectorModulo::is_member function
- Fixes C++ compilation warning that was treated as error due to -Werror flag
- Resolves "warning: 'is_member' overrides a member function but is not marked 'override'"
- Improves code conformance to modern C++ best practices
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: update Faiss submodule with override keyword fix
* fix: update DiskANN submodule with additional type cast fix
- Add missing type cast in DistanceFastL2::norm function SSE2 version
- Fixes const float* = const signed char* compilation error
- Ensures consistent type casting across all SIMD code paths
- Resolves template instantiation error for DistanceFastL2<int8_t>
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* debug: simplify wheel compatibility checking
- Fix YAML syntax error in debug step
- Use simpler approach to show platform tags and wheel names
- This will help identify platform tag compatibility issues
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: use correct Python version for wheel builds
- Replace --python python with --python ${{ matrix.python }}
- This ensures wheels are built for the correct Python version in each matrix job
- Fixes Python version mismatch where cp39 wheels were used in cp311 environments
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: resolve wheel installation conflicts in CI matrix builds
Fix issue where multiple Python versions' wheels in the same dist directory
caused installation conflicts during CI testing. The problem occurred when
matrix builds for different Python versions accumulated wheels in shared
directories, and uv pip install would find incompatible wheels.
Changes:
- Add Python version detection using matrix.python variable
- Convert Python version to wheel tag format (e.g., 3.11 -> cp311)
- Use find with version-specific pattern matching to select correct wheels
- Add explicit error handling if no matching wheel is found
This ensures each CI job installs only wheels compatible with its specific
Python version, preventing "A path dependency is incompatible with the
current platform" errors.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: ensure virtual environment uses correct Python version in CI
Fix issue where uv venv was creating virtual environments with a different
Python version than specified in the matrix, causing wheel compatibility
errors. The problem occurred when the system had multiple Python versions
and uv venv defaulted to a different version than intended.
Changes:
- Add --python ${{ matrix.python }} flag to uv venv command
- Ensures virtual environment matches the matrix-specified Python version
- Fixes "The wheel is compatible with CPython 3.X but you're using CPython 3.Y" errors
This ensures wheel installation selects and installs the correctly built
wheels that match the runtime Python version.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: complete Python 3.9 type annotation compatibility fixes
Fix remaining Python 3.9 incompatible type annotations throughout the
leann-core package that were causing test failures in CI. The union operator
(|) syntax for type hints was introduced in Python 3.10 and causes
"TypeError: unsupported operand type(s) for |" errors in Python 3.9.
Changes:
- Convert dict[str, Any] | None to Optional[dict[str, Any]]
- Convert int | None to Optional[int]
- Convert subprocess.Popen | None to Optional[subprocess.Popen]
- Convert LeannBackendFactoryInterface | None to Optional[LeannBackendFactoryInterface]
- Add missing Optional imports to all affected files
This resolves all test failures related to type annotation syntax and ensures
compatibility with Python 3.9 as specified in pyproject.toml.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: complete Python 3.9 type annotation fixes in backend packages
Fix remaining Python 3.9 incompatible type annotations in backend packages
that were causing test failures. The union operator (|) syntax for type hints
was introduced in Python 3.10 and causes "TypeError: unsupported operand
type(s) for |" errors in Python 3.9.
Changes in leann-backend-diskann:
- Convert zmq_port: int | None to Optional[int] in diskann_backend.py
- Convert passages_file: str | None to Optional[str] in diskann_embedding_server.py
- Add Optional imports to both files
Changes in leann-backend-hnsw:
- Convert zmq_port: int | None to Optional[int] in hnsw_backend.py
- Add Optional import
This resolves the final test failures related to type annotation syntax and
ensures full Python 3.9 compatibility across all packages.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: remove Python 3.10+ zip strict parameter for Python 3.9 compatibility
Remove the strict=False parameter from zip() call in api.py as it was
introduced in Python 3.10 and causes "TypeError: zip() takes no keyword
arguments" in Python 3.9.
The strict parameter controls whether zip() raises an exception when the
iterables have different lengths. Since we're not relying on this behavior
and the code works correctly without it, removing it maintains the same
functionality while ensuring Python 3.9 compatibility.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: ensure leann-core package is built on all platforms, not just Ubuntu
This fixes the issue where CI was installing leann-core from PyPI instead of
using locally built package with Python 3.9 compatibility fixes.
* fix: build and install leann meta package on all platforms
The leann meta package is pure Python and platform-independent, so there's
no reason to restrict it to Ubuntu only. This ensures all platforms use
consistent local builds instead of falling back to PyPI versions.
* fix: restrict MLX dependencies to Apple Silicon Macs only
MLX framework only supports Apple Silicon (ARM64) Macs, not Intel x86_64.
Add platform_machine == 'arm64' condition to prevent installation failures
on Intel Macs (macos-13).
* cleanup: simplify CI configuration
- Remove debug step with non-existent 'uv pip debug' command
- Simplify wheel installation logic - let uv handle compatibility
- Use -e .[test] instead of manually listing all test dependencies
* fix: install backend wheels before meta packages
Install backend wheels first to ensure they're available when core/meta
packages are installed, preventing uv from trying to resolve backend
dependencies from PyPI.
* fix: use local leann-core when building backend packages
Add --find-links to backend builds to ensure they use the locally built
leann-core with fixed MLX dependencies instead of downloading from PyPI.
Also bump leann-core version to 0.2.8 to ensure clean dependency resolution.
* fix: use absolute path for find-links and upgrade backend version
- Use GITHUB_WORKSPACE for absolute path to ensure find-links works
- Upgrade leann-backend-hnsw to 0.2.8 to match leann-core version
* fix: use absolute path for find-links and upgrade backend version
- Use GITHUB_WORKSPACE for absolute path to ensure find-links works
- Upgrade leann-backend-hnsw to 0.2.8 to match leann-core version
* fix: correct version consistency for --find-links to work properly
- All packages now use version 0.2.7 consistently
- Backend packages can find exact leann-core==0.2.7 from local build
- This ensures --find-links works during CI builds instead of falling back to PyPI
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: revert all packages to consistent version 0.2.7
- This PR should not bump versions, only fix Intel Mac build
- Version bumps should be done in release_manual workflow
- All packages now use 0.2.7 consistently for --find-links to work
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: use --find-links during package installation to avoid PyPI MLX conflicts
- Backend wheels contain Requires-Dist: leann-core==0.2.7
- Without --find-links, uv resolves this from PyPI which has MLX for all Darwin
- With --find-links, uv uses local leann-core with proper platform restrictions
- Root cause: dependency resolution happens at install time, not just build time
- Local test confirms this fixes Intel Mac MLX dependency issues
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: restrict MLX dependencies to ARM64 Macs in workspace pyproject.toml
- Root pyproject.toml also had MLX dependencies without platform_machine restriction
- This caused test dependency installation to fail on Intel Macs
- Now consistent with packages/leann-core/pyproject.toml platform restrictions
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* chore: cleanup unused files and fix GitHub Actions warnings
- Remove unused packages/leann-backend-diskann/CMakeLists.txt
(DiskANN uses cmake.source-dir=third_party/DiskANN instead)
- Replace macos-latest with macos-14 to avoid migration warnings
(macos-latest will migrate to macOS 15 on August 4, 2025)
- Keep packages/leann-backend-hnsw/CMakeLists.txt (needed for Faiss config)
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: properly handle Python 3.13 support with PyTorch compatibility
- Support Python 3.13 on most platforms (Ubuntu, ARM64 Mac)
- Exclude Intel Mac + Python 3.13 combination due to PyTorch wheel availability
- PyTorch <2.5 supports Intel Mac but not Python 3.13
- PyTorch 2.5+ supports Python 3.13 but not Intel Mac x86_64
- Document limitation in CI configuration comments
- Update README badges with detailed Python version support and CI status
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Auto-enable debug mode for debug/clean-state-investigation branch
- Add more debug info to troubleshoot trigger issues
- This ensures tmate will start regardless of trigger method
The issue was that tmate was placed before pytest step, but the hang
occurs during pytest execution. Now tmate starts inside the test step
and provides connection info before running tests.
1. Tmate SSH Debugging:
- Added manual workflow_dispatch trigger with debug_enabled option
- Integrated mxschmitt/action-tmate@v3 for SSH access to CI runner
- Can be triggered manually or by adding [debug] to commit message
- Detached mode with 30min timeout, limited to actor only
- Also triggers on test failure when debug is enabled
2. Enhanced Pytest Output:
- Added --capture=no to see real-time output
- Added --log-cli-level=DEBUG for maximum verbosity
- Added --tb=short for cleaner tracebacks
- Pipe output to tee for both display and logging
- Show last 20 lines of output on completion
3. Environment Diagnostics:
- Export PYTHONUNBUFFERED=1 for immediate output
- Show Python/Pytest versions at start
- Display relevant environment variables
- Check network ports before/after tests
4. Diagnostic Script:
- Created scripts/diagnose_hang.sh for comprehensive system checks
- Shows processes, network, file descriptors, memory, ZMQ status
- Automatically runs on timeout for detailed debugging info
This allows debugging CI hangs via SSH when needed while providing extensive logging by default.
* feat: Add Ollama embedding support for local embedding models
* docs: Add clear documentation for Ollama embedding usage
* fix: remove leann_ask
* docs: remove ollama embedding extra instructions
* simplify MCP interface for Claude Code
- Remove unnecessary search parameters: search_mode, recompute_embeddings, file_types, min_score
- Remove leann_clear tool (not needed for Claude Code workflow)
- Streamline search to only use: query, index_name, top_k, complexity
- Keep core tools: leann_index, leann_search, leann_status, leann_list
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* remove leann_index from MCP interface
Users should use CLI command 'leann build' to create indexes first.
MCP now only provides search functionality:
- leann_search: search existing indexes
- leann_status: check index health
- leann_list: list available indexes
This separates index creation (CLI) from search (Claude Code).
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* improve CLI with auto project name and .gitignore support
- Make index_name optional, auto-use current directory name
- Read .gitignore patterns and respect them during indexing
- Add _read_gitignore_patterns() to parse .gitignore files
- Add _should_exclude_file() for pattern matching
- Apply exclusion patterns to both PDF and general file processing
- Show helpful messages about gitignore usage
Now users can simply run: leann build
And it will use project name + respect .gitignore patterns.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
* feat: Add Ollama embedding support for local embedding models
* docs: Add clear documentation for Ollama embedding usage
* feat: Enhance Ollama embedding with better error handling and concurrent processing
- Add intelligent model validation and suggestions (inspired by OllamaChat)
- Implement concurrent processing for better performance
- Add retry mechanism with timeout handling
- Provide user-friendly error messages with emojis
- Auto-detect and recommend embedding models
- Add text truncation for long texts
- Improve progress bar display logic
* docs: don't mention it in README
- Add 'simulated' to the LLM choices in base_rag_example.py
- Handle simulated case in get_llm_config() method
- This allows tests to use --llm simulated to avoid API costs
- Improve grammar and sentence structure in MCP section
- Add proper markdown image formatting with relative paths
- Optimize mcp_leann.png size (1.3MB -> 224KB)
- Update data description to be more specific about Chinese content
- Add flush=True to all print statements in convert_to_csr.py to prevent buffer deadlock
- Redirect embedding server stdout/stderr to DEVNULL in CI environment (CI=true)
- Fix timeout in embedding_server_manager.stop_server() final wait call