Fix remaining Python 3.9 incompatible type annotations in backend packages
that were causing test failures. The union operator (|) syntax for type hints
was introduced in Python 3.10 and causes "TypeError: unsupported operand
type(s) for |" errors in Python 3.9.
Changes in leann-backend-diskann:
- Convert zmq_port: int | None to Optional[int] in diskann_backend.py
- Convert passages_file: str | None to Optional[str] in diskann_embedding_server.py
- Add Optional imports to both files
Changes in leann-backend-hnsw:
- Convert zmq_port: int | None to Optional[int] in hnsw_backend.py
- Add Optional import
This resolves the final test failures related to type annotation syntax and
ensures full Python 3.9 compatibility across all packages.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Build system improvements:
- Simplify macOS environment detection using brew --prefix
- Remove complex hardcoded paths and CMAKE_ARGS
- Let CMake automatically find Homebrew packages via CMAKE_PREFIX_PATH
- Clean separation between Intel (/usr/local) and Apple Silicon (/opt/homebrew)
Python 3.9 compatibility:
- Set ruff target-version to py39 to match project requirements
- Replace str | None with Union[str, None] in type annotations
- Add Union imports where needed
- Fix core interface, CLI, chat, and embedding server files
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* feat: Add Ollama embedding support for local embedding models
* docs: Add clear documentation for Ollama embedding usage
* feat: Enhance Ollama embedding with better error handling and concurrent processing
- Add intelligent model validation and suggestions (inspired by OllamaChat)
- Implement concurrent processing for better performance
- Add retry mechanism with timeout handling
- Provide user-friendly error messages with emojis
- Auto-detect and recommend embedding models
- Add text truncation for long texts
- Improve progress bar display logic
* docs: don't mention it in README
* fix: auto-detect normalized embeddings and use cosine distance
- Add automatic detection for normalized embedding models (OpenAI, Voyage AI, Cohere)
- Automatically set distance_metric='cosine' for normalized embeddings
- Add warnings when using non-optimal distance metrics
- Implement manual L2 normalization in HNSW backend (custom Faiss build lacks normalize_L2)
- Fix DiskANN zmq_port compatibility with lazy loading strategy
- Add documentation for normalized embeddings feature
This fixes the low accuracy issue when using OpenAI text-embedding-3-small model with default MIPS metric.
* style: format
* feat: add OpenAI embeddings support to google_history_reader_leann.py
- Add --embedding-model and --embedding-mode arguments
- Support automatic detection of normalized embeddings
- Works correctly with cosine distance for OpenAI embeddings
* feat: add --use-existing-index option to google_history_reader_leann.py
- Allow using existing index without rebuilding
- Useful for testing pre-built indices
* fix: Improve OpenAI embeddings handling in HNSW backend
* fix: auto-detect normalized embeddings and use cosine distance
- Add automatic detection for normalized embedding models (OpenAI, Voyage AI, Cohere)
- Automatically set distance_metric='cosine' for normalized embeddings
- Add warnings when using non-optimal distance metrics
- Implement manual L2 normalization in HNSW backend (custom Faiss build lacks normalize_L2)
- Fix DiskANN zmq_port compatibility with lazy loading strategy
- Add documentation for normalized embeddings feature
This fixes the low accuracy issue when using OpenAI text-embedding-3-small model with default MIPS metric.
* style: format
- Fix ambiguous fullwidth characters (commas, parentheses) in strings and comments
- Replace Chinese comments with English equivalents
- Fix unused imports with proper noqa annotations for intentional imports
- Fix bare except clauses with specific exception types
- Fix redefined variables and undefined names
- Add ruff noqa annotations for generated protobuf files
- Add lint and format check to GitHub Actions CI pipeline
- Convert relative paths to absolute paths based on metadata file location
- Fixes FileNotFoundError when starting embedding server
- Resolves issue with passages file not found in different working directories
* chore: shorter build time
* chore: update faiss
* fix: no longger do embedding server reuse
* fix: do not reuse emb_server and close it properly
* feat: cli tool
* feat: cli more args
* fix: same embedding logic
* fix: diskann zmq port and passages
* feat: auto discovery of packages and fix passage gen for diskann
* docs: embedding pruning
* refactor: passage structure
* feat: reproducible research datas, rpj_wiki & dpr
* refactor: chat and base searcher
* feat: chat on mps