Compare commits

43 Commits

Author SHA1 Message Date
Andy Lee
79ec7d1aee fix: properly handle Python 3.13 support with PyTorch compatibility
- Support Python 3.13 on most platforms (Ubuntu, ARM64 Mac)
- Exclude Intel Mac + Python 3.13 combination due to PyTorch wheel availability
- PyTorch <2.5 supports Intel Mac but not Python 3.13
- PyTorch 2.5+ supports Python 3.13 but not Intel Mac x86_64
- Document limitation in CI configuration comments
- Update README badges with detailed Python version support and CI status
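
The exclusion rule in the list above can be sketched as a small predicate. This is only an illustration, not code from the repository: `is_supported_combo` is a hypothetical name, and the runner labels are the ones that appear in the CI matrix further down.

```python
def is_supported_combo(os_label: str, python_version: str) -> bool:
    """Return False only for the combination the CI matrix leaves out.

    Assumption: macos-13 is the Intel (x86_64) runner, as the commit
    messages state; all other listed combinations are treated as supported.
    """
    intel_mac = os_label == "macos-13"
    return not (intel_mac and python_version == "3.13")


# Build the matrix the way the workflow lists it, minus the excluded pair.
matrix = [
    (os_label, py)
    for os_label in ("ubuntu-22.04", "macos-14", "macos-13")
    for py in ("3.9", "3.10", "3.11", "3.12", "3.13")
    if is_supported_combo(os_label, py)
]
```

Keeping the rule in one predicate makes the reason for the gap explicit instead of leaving it implied by a missing matrix entry.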

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-11 16:38:24 -07:00
Andy Lee
288d3c4e75 chore: cleanup unused files and fix GitHub Actions warnings
- Remove unused packages/leann-backend-diskann/CMakeLists.txt
  (DiskANN uses cmake.source-dir=third_party/DiskANN instead)
- Replace macos-latest with macos-14 to avoid migration warnings
  (macos-latest will migrate to macOS 15 on August 4, 2025)
- Keep packages/leann-backend-hnsw/CMakeLists.txt (needed for Faiss config)

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-11 16:16:26 -07:00
Andy Lee
7e554b2ba2 fix: restrict MLX dependencies to ARM64 Macs in workspace pyproject.toml
- Root pyproject.toml also had MLX dependencies without platform_machine restriction
- This caused test dependency installation to fail on Intel Macs
- Now consistent with packages/leann-core/pyproject.toml platform restrictions

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-11 15:39:28 -07:00
Andy Lee
afd48d5901 fix: use --find-links during package installation to avoid PyPI MLX conflicts
- Backend wheels contain Requires-Dist: leann-core==0.2.7
- Without --find-links, uv resolves this from PyPI which has MLX for all Darwin
- With --find-links, uv uses local leann-core with proper platform restrictions
- Root cause: dependency resolution happens at install time, not just build time
- Local test confirms this fixes Intel Mac MLX dependency issues

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-11 15:24:52 -07:00
Andy Lee
36083dbf0f fix: revert all packages to consistent version 0.2.7
- This PR should not bump versions, only fix Intel Mac build
- Version bumps should be done in release_manual workflow
- All packages now use 0.2.7 consistently for --find-links to work

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-11 14:56:30 -07:00
Andy Lee
f819dacbb4 Merge commit '6762e5b' into fix-mac-intel-build 2025-08-11 14:39:18 -07:00
Andy Lee
6762e5b2c4 fix: correct version consistency for --find-links to work properly
- All packages now use version 0.2.7 consistently
- Backend packages can find exact leann-core==0.2.7 from local build
- This ensures --find-links works during CI builds instead of falling back to PyPI
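
Why the versions have to agree can be shown with a minimal sketch. It assumes, as the commit messages describe, that an exact `==` pin can only be satisfied from the local `dist` directory when a locally built package has exactly that version. `satisfied_locally` is a hypothetical helper and does not reproduce uv's resolver.

```python
def satisfied_locally(requirement: str, local_versions: set) -> bool:
    """Check whether an exact `name==version` pin matches a local build.

    Simplified on purpose: only the `==` operator is understood; any
    other requirement is accepted whenever some local version exists.
    """
    _name, sep, pinned = requirement.partition("==")
    if not sep:
        return bool(local_versions)
    return pinned.strip() in local_versions


# With core built as 0.2.8, a backend pinned to 0.2.7 finds no local match.
mismatch = satisfied_locally("leann-core==0.2.7", {"0.2.8"})
match = satisfied_locally("leann-core==0.2.7", {"0.2.7"})
```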

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-11 14:38:58 -07:00
Andy Lee
0797008a3f fix: use absolute path for find-links and upgrade backend version
- Use GITHUB_WORKSPACE for absolute path to ensure find-links works
- Upgrade leann-backend-hnsw to 0.2.8 to match leann-core version
2025-08-11 14:24:55 -07:00
Andy Lee
b835eb821e fix: use absolute path for find-links and upgrade backend version
- Use GITHUB_WORKSPACE for absolute path to ensure find-links works
- Upgrade leann-backend-hnsw to 0.2.8 to match leann-core version
2025-08-11 14:24:12 -07:00
Andy Lee
a0c790f285 fix: use local leann-core when building backend packages
Add --find-links to backend builds to ensure they use the locally built
leann-core with fixed MLX dependencies instead of downloading from PyPI.

Also bump leann-core version to 0.2.8 to ensure clean dependency resolution.
2025-08-11 13:38:03 -07:00
Andy Lee
b7516608ab fix: install backend wheels before meta packages
Install backend wheels first to ensure they're available when core/meta
packages are installed, preventing uv from trying to resolve backend
dependencies from PyPI.
2025-08-11 10:03:27 -07:00
Andy Lee
1bd1238db6 cleanup: simplify CI configuration
- Remove debug step with non-existent 'uv pip debug' command
- Simplify wheel installation logic - let uv handle compatibility
- Use -e .[test] instead of manually listing all test dependencies
2025-08-11 01:54:54 -07:00
Andy Lee
b5c80edb03 Merge branch 'main' into fix-mac-intel-build 2025-08-11 01:54:29 -07:00
Andy Lee
430969565e fix: restrict MLX dependencies to Apple Silicon Macs only
MLX framework only supports Apple Silicon (ARM64) Macs, not Intel x86_64.
Add platform_machine == 'arm64' condition to prevent installation failures
on Intel Macs (macos-13).
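
The marker condition can be mirrored in plain Python to see which machines it selects. This is a sketch: `needs_mlx` is a hypothetical name, and it relies on the environment markers `sys_platform` and `platform_machine` corresponding to `sys.platform` and `platform.machine()`.

```python
import platform
import sys


def needs_mlx(sys_platform: str, machine: str) -> bool:
    """Mirror: sys_platform == 'darwin' and platform_machine == 'arm64'."""
    return sys_platform == "darwin" and machine == "arm64"


# Evaluate the condition for the interpreter running this snippet.
install_mlx_here = needs_mlx(sys.platform, platform.machine())
```

An Intel Mac reports `x86_64` as its machine, so the condition is false there and the MLX requirement is skipped.
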
2025-08-11 01:53:04 -07:00
Andy Lee
578a89d180 fix: build and install leann meta package on all platforms
The leann meta package is pure Python and platform-independent, so there's
no reason to restrict it to Ubuntu only. This ensures all platforms use
consistent local builds instead of falling back to PyPI versions.
2025-08-11 01:40:55 -07:00
Andy Lee
068fb38bae fix: ensure leann-core package is built on all platforms, not just Ubuntu
This fixes the issue where CI was installing leann-core from PyPI instead of
using locally built package with Python 3.9 compatibility fixes.
2025-08-11 01:28:12 -07:00
Andy Lee
6aa1a97a07 fix: remove Python 3.10+ zip strict parameter for Python 3.9 compatibility
Remove the strict=False parameter from zip() call in api.py as it was
introduced in Python 3.10 and causes "TypeError: zip() takes no keyword
arguments" in Python 3.9.

The strict parameter controls whether zip() raises an exception when the
iterables have different lengths. Since we're not relying on this behavior
and the code works correctly without it, removing it maintains the same
functionality while ensuring Python 3.9 compatibility.
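
For code that does want `strict` where it is available, a version-guarded wrapper is one option. This is a sketch of an alternative, not what the commit does (the commit simply drops the argument); `zip_compat` is a hypothetical name.

```python
import sys


def zip_compat(*iterables, strict=False):
    """zip() wrapper that forwards `strict` only on Python 3.10+."""
    if sys.version_info >= (3, 10):
        return zip(*iterables, strict=strict)
    if strict:
        # Python 3.9's zip() has no keyword arguments at all.
        raise NotImplementedError("strict zip requires Python 3.10+")
    return zip(*iterables)


# Without strict, zip stops at the shortest iterable on every version.
pairs = list(zip_compat([1, 2, 3], ["a", "b"]))
```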

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-11 00:47:43 -07:00
Andy Lee
3a1cb49e20 fix: complete Python 3.9 type annotation fixes in backend packages
Fix remaining Python 3.9 incompatible type annotations in backend packages
that were causing test failures. The union operator (|) syntax for type hints
was introduced in Python 3.10 and causes "TypeError: unsupported operand
type(s) for |" errors in Python 3.9.

Changes in leann-backend-diskann:
- Convert zmq_port: int | None to Optional[int] in diskann_backend.py
- Convert passages_file: str | None to Optional[str] in diskann_embedding_server.py
- Add Optional imports to both files

Changes in leann-backend-hnsw:
- Convert zmq_port: int | None to Optional[int] in hnsw_backend.py
- Add Optional import

This resolves the final test failures related to type annotation syntax and
ensures full Python 3.9 compatibility across all packages.
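
The rewritten annotation style can be checked in isolation. In this sketch `endpoint` is a hypothetical function; only the `Optional[int]` spelling for `zmq_port` comes from the change itself.

```python
from typing import Optional, Union, get_type_hints


def endpoint(zmq_port: Optional[int] = None) -> Optional[str]:
    """Hypothetical helper: build an address only when a port is given."""
    if zmq_port is None:
        return None
    return f"tcp://localhost:{zmq_port}"


# Optional[X] is shorthand for Union[X, None]; both work on Python 3.9.
same_type = Optional[int] == Union[int, None]
hints = get_type_hints(endpoint)
```

Writing `int | None` in the same signature would be evaluated when the function is defined, which is where Python 3.9 raises the `TypeError` quoted above.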

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-10 23:58:31 -07:00
GitHub Actions
239e35e2e6 chore: release v0.2.7 2025-08-11 03:11:46 +00:00
Andy Lee
2fac0c6fbf fix: improve gitignore and Jupyter notebook support (#28)
- Add nbconvert dependency for .ipynb file support
- Replace manual gitignore parsing with gitignore-parser library
- Proper recursive .gitignore handling (all subdirectories)
- Fix compliance with Git gitignore behavior
- Simplify code and improve reliability

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Claude <noreply@anthropic.com>
2025-08-10 20:02:46 -07:00
Andy Lee
fe9381fc8b fix: complete Python 3.9 type annotation compatibility fixes
Fix remaining Python 3.9 incompatible type annotations throughout the
leann-core package that were causing test failures in CI. The union operator
(|) syntax for type hints was introduced in Python 3.10 and causes
"TypeError: unsupported operand type(s) for |" errors in Python 3.9.

Changes:
- Convert dict[str, Any] | None to Optional[dict[str, Any]]
- Convert int | None to Optional[int]
- Convert subprocess.Popen | None to Optional[subprocess.Popen]
- Convert LeannBackendFactoryInterface | None to Optional[LeannBackendFactoryInterface]
- Add missing Optional imports to all affected files

This resolves all test failures related to type annotation syntax and ensures
compatibility with Python 3.9 as specified in pyproject.toml.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-10 18:48:10 -07:00
Andy Lee
037aad0870 fix: ensure virtual environment uses correct Python version in CI
Fix issue where uv venv was creating virtual environments with a different
Python version than specified in the matrix, causing wheel compatibility
errors. The problem occurred when the system had multiple Python versions
and uv venv defaulted to a different version than intended.

Changes:
- Add --python ${{ matrix.python }} flag to uv venv command
- Ensures virtual environment matches the matrix-specified Python version
- Fixes "The wheel is compatible with CPython 3.X but you're using CPython 3.Y" errors

This ensures wheel installation selects and installs the correctly built
wheels that match the runtime Python version.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-10 17:47:44 -07:00
Andy Lee
ded0decd17 fix: resolve wheel installation conflicts in CI matrix builds
Fix issue where multiple Python versions' wheels in the same dist directory
caused installation conflicts during CI testing. The problem occurred when
matrix builds for different Python versions accumulated wheels in shared
directories, and uv pip install would find incompatible wheels.

Changes:
- Add Python version detection using matrix.python variable
- Convert Python version to wheel tag format (e.g., 3.11 -> cp311)
- Use find with version-specific pattern matching to select correct wheels
- Add explicit error handling if no matching wheel is found

This ensures each CI job installs only wheels compatible with its specific
Python version, preventing "A path dependency is incompatible with the
current platform" errors.
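
The selection logic described above can be sketched in Python. The workflow itself uses shell and `find`, so this shows the idea rather than the CI code; the function names and the wheel filenames in the usage lines are made up for the example.

```python
def python_tag(version: str) -> str:
    """Convert a 'major.minor' version such as '3.11' to 'cp311'."""
    major, minor = version.split(".")[:2]
    return f"cp{major}{minor}"


def matching_wheels(filenames, version):
    """Keep only wheels whose filename carries the tag for `version`."""
    tag = python_tag(version)
    matches = [
        name for name in filenames
        if name.endswith(".whl") and f"-{tag}-" in name
    ]
    if not matches:
        # Fail loudly instead of installing an incompatible wheel.
        raise FileNotFoundError(f"no wheel found for {tag}")
    return matches


# Hypothetical dist directory holding wheels from two matrix jobs.
dist = [
    "leann_backend_hnsw-0.2.7-cp39-cp39-macosx_11_0_arm64.whl",
    "leann_backend_hnsw-0.2.7-cp311-cp311-macosx_11_0_arm64.whl",
]
selected = matching_wheels(dist, "3.11")
```

Matching on `-cp311-` with both hyphens keeps a shorter tag from matching inside a longer one.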

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-10 17:31:46 -07:00
Andy Lee
5094e6800a fix: use correct Python version for wheel builds
- Replace --python python with --python ${{ matrix.python }}
- This ensures wheels are built for the correct Python version in each matrix job
- Fixes Python version mismatch where cp39 wheels were used in cp311 environments

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-10 17:20:24 -07:00
Andy Lee
f08132c525 debug: simplify wheel compatibility checking
- Fix YAML syntax error in debug step
- Use simpler approach to show platform tags and wheel names
- This will help identify platform tag compatibility issues

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-10 17:07:18 -07:00
Andy Lee
6ae9e0f4f9 fix: update DiskANN submodule with additional type cast fix
- Add missing type cast in DistanceFastL2::norm function SSE2 version
- Fixes const float* = const signed char* compilation error
- Ensures consistent type casting across all SIMD code paths
- Resolves template instantiation error for DistanceFastL2<int8_t>

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-10 15:09:01 -07:00
Andy Lee
2be39db799 fix: update Faiss submodule with override keyword fix 2025-08-10 02:01:49 -07:00
Andy Lee
756864d058 fix: update Faiss submodule with override keyword fix
- Add missing override keyword to IDSelectorModulo::is_member function
- Fixes C++ compilation warning that was treated as error due to -Werror flag
- Resolves "warning: 'is_member' overrides a member function but is not marked 'override'"
- Improves code conformance to modern C++ best practices

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-10 01:59:10 -07:00
Andy Lee
4fc8943ca7 fix: update DiskANN submodule with type cast fix for signed char templates
- Add missing type casts (float*)a and (float*)b in SSE2 version
- This matches the existing type casts in the AVX version
- Fixes compilation error when instantiating DistanceInnerProduct<int8_t>
- Resolves "cannot initialize const float* with const signed char*" error

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-10 01:39:16 -07:00
Andy Lee
1bc4bf06f0 fix: update DiskANN submodule with SIMD function name corrections
- Fix _mm128_loadu_ps to _mm_loadu_ps (and similar functions)
- This is a known issue in upstream DiskANN code where incorrect function names were used
- Resolves compilation errors on macOS Intel builds

References: Known DiskANN issue with SIMD intrinsics naming

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-10 01:27:04 -07:00
Andy Lee
8d1e04d7a1 fix: update DiskANN submodule with macOS Intel/Apple Silicon compatibility fixes
- Auto-detect Homebrew libomp path using OpenMP_ROOT environment variable
- Exclude mkl_set_num_threads on macOS (uses Accelerate framework instead of MKL)
- Fixes compilation on Intel Macs by using correct /usr/local paths

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-09 19:27:02 -07:00
Andy Lee
f009d2add3 fix: remove hardcoded /opt/homebrew paths from DiskANN CMake
- Auto-detect Homebrew libomp path using OpenMP_ROOT environment variable
- Fallback to CMAKE_PREFIX_PATH/opt/libomp if OpenMP_ROOT not set
- Final fallback to brew --prefix libomp for auto-detection
- Maintains backwards compatibility with old hardcoded path
- Fixes Intel Mac builds that were failing due to hardcoded Apple Silicon paths
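
The fallback order in the list above can be written out as a short function. The real logic lives in CMake, so this is only a sketch with several assumptions: `find_libomp` and its parameters are hypothetical, `CMAKE_PREFIX_PATH` is treated as a single prefix, and the old hardcoded path is taken to be the last resort.

```python
import os
import subprocess
from typing import Callable, Mapping, Optional


def _brew_prefix_libomp() -> Optional[str]:
    """Ask Homebrew for the libomp prefix; None if brew is missing or fails."""
    try:
        result = subprocess.run(
            ["brew", "--prefix", "libomp"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip() or None


def find_libomp(
    env: Mapping[str, str] = os.environ,
    brew: Callable[[], Optional[str]] = _brew_prefix_libomp,
    legacy: str = "/opt/homebrew/opt/libomp",
) -> str:
    """Resolve the libomp directory using the documented fallback order."""
    if env.get("OpenMP_ROOT"):            # 1. explicit override
        return env["OpenMP_ROOT"]
    if env.get("CMAKE_PREFIX_PATH"):      # 2. derive from the prefix
        return env["CMAKE_PREFIX_PATH"].rstrip("/") + "/opt/libomp"
    return brew() or legacy               # 3. ask brew, 4. old default
```

Passing `env` and `brew` as arguments lets the order be checked without Homebrew installed, for example `find_libomp({"CMAKE_PREFIX_PATH": "/usr/local"}, brew=lambda: None)`.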

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-09 18:30:42 -07:00
Andy Lee
1dfc2f3737 fix: configure CMake paths in pyproject.toml for proper Homebrew detection
- Add CMAKE_PREFIX_PATH and OpenMP_ROOT environment variable mapping in both backends
- Remove CMAKE_ARGS from GitHub Actions workflow (cleaner separation)
- Ensure scikit-build-core correctly uses environment variables for CMake configuration
- This should fix the hardcoded /opt/homebrew paths on Intel Macs

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-09 18:14:48 -07:00
Andy Lee
0543d61572 fix: ensure CMAKE_PREFIX_PATH is passed to backend builds
- Add CMAKE_ARGS with CMAKE_PREFIX_PATH and OpenMP_ROOT for both HNSW and DiskANN backends
- This ensures CMake can find Homebrew packages on both Intel (/usr/local) and Apple Silicon (/opt/homebrew)
- Fixes the issue where CMake was still looking for hardcoded paths instead of using detected ones

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-09 17:58:49 -07:00
Andy Lee
abcc1fed31 fix: type 2025-08-09 17:40:13 -07:00
Andy Lee
c1832765cd Merge branch 'main' into fix-mac-intel-build 2025-08-09 17:35:10 -07:00
Andy Lee
4a5db385f0 fix: clean build system and Python 3.9 compatibility
Build system improvements:
- Simplify macOS environment detection using brew --prefix
- Remove complex hardcoded paths and CMAKE_ARGS
- Let CMake automatically find Homebrew packages via CMAKE_PREFIX_PATH
- Clean separation between Intel (/usr/local) and Apple Silicon (/opt/homebrew)

Python 3.9 compatibility:
- Set ruff target-version to py39 to match project requirements
- Replace str | None with Union[str, None] in type annotations
- Add Union imports where needed
- Fix core interface, CLI, chat, and embedding server files

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-09 17:27:00 -07:00
Andy Lee
5f5b97fb54 fix: add abseil include path to CPPFLAGS for both Intel and Apple Silicon
- Add -I/opt/homebrew/opt/abseil/include to CPPFLAGS for Apple Silicon
- Add -I/usr/local/opt/abseil/include to CPPFLAGS for Intel
- Fixes 'absl/log/absl_log.h' file not found by ensuring abseil headers are in compiler include path

Root cause: CMAKE_PREFIX_PATH alone wasn't sufficient - compiler needs explicit -I flags

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-09 17:16:40 -07:00
Andy Lee
754c9aaedd fix: add abseil library path for protobuf compilation on macOS
- Include abseil in CMAKE_PREFIX_PATH for both Intel and Apple Silicon Macs
- Add explicit absl_DIR CMake variable to help find abseil for protobuf
- Fixes 'absl/log/absl_log.h' file not found error during compilation

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-09 16:55:47 -07:00
Andy Lee
1b01725dd1 fix: improve macOS build reliability with proper OpenMP path detection
- Add proper CMAKE_PREFIX_PATH and OpenMP_ROOT detection for both Intel and Apple Silicon Macs
- Set LDFLAGS and CPPFLAGS for all Homebrew packages to ensure CMake can find them
- Apply CMAKE_ARGS to both HNSW and DiskANN backends for consistent builds
- Fix hardcoded paths that caused build failures on Intel Macs (macos-13)

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-09 16:48:26 -07:00
Andy Lee
a620c2077a fix: auto-detect Homebrew paths for both DiskANN and HNSW backends
- Fix DiskANN CMakeLists.txt path reference
- Add macOS environment variable detection for OpenMP_ROOT
- Support both Intel (/usr/local) and Apple Silicon (/opt/homebrew) paths
2025-08-09 23:25:00 +00:00
Andy Lee
e16c369bfb fix: auto-detect Homebrew path for Intel vs Apple Silicon Macs
This fixes the hardcoded /opt/homebrew path which only works on Apple
Silicon Macs. Intel Macs use /usr/local as the Homebrew prefix.
2025-08-09 23:16:41 +00:00
Andy Lee
368c587c4f ci: add Mac Intel (x86_64) build support 2025-08-09 22:37:50 +00:00
22 changed files with 4005 additions and 3623 deletions

@@ -54,16 +54,26 @@ jobs:
             python: '3.12'
           - os: ubuntu-22.04
             python: '3.13'
-          - os: macos-latest
+          - os: macos-14
             python: '3.9'
-          - os: macos-latest
+          - os: macos-14
             python: '3.10'
-          - os: macos-latest
+          - os: macos-14
             python: '3.11'
-          - os: macos-latest
+          - os: macos-14
             python: '3.12'
-          - os: macos-latest
+          - os: macos-14
             python: '3.13'
+          - os: macos-13
+            python: '3.9'
+          - os: macos-13
+            python: '3.10'
+          - os: macos-13
+            python: '3.11'
+          - os: macos-13
+            python: '3.12'
+          # Note: macos-13 + Python 3.13 excluded due to PyTorch compatibility
+          # (PyTorch 2.5+ supports Python 3.13 but not Intel Mac x86_64)
     runs-on: ${{ matrix.os }}
     steps:
@@ -109,48 +119,59 @@ jobs:
             uv pip install --system delocate
           fi
+      - name: Set macOS environment variables
+        if: runner.os == 'macOS'
+        run: |
+          # Use brew --prefix to automatically detect Homebrew installation path
+          HOMEBREW_PREFIX=$(brew --prefix)
+          echo "HOMEBREW_PREFIX=${HOMEBREW_PREFIX}" >> $GITHUB_ENV
+          echo "OpenMP_ROOT=${HOMEBREW_PREFIX}/opt/libomp" >> $GITHUB_ENV
+          # Set CMAKE_PREFIX_PATH to let CMake find all packages automatically
+          echo "CMAKE_PREFIX_PATH=${HOMEBREW_PREFIX}" >> $GITHUB_ENV
+          # Set compiler flags for OpenMP (required for both backends)
+          echo "LDFLAGS=-L${HOMEBREW_PREFIX}/opt/libomp/lib" >> $GITHUB_ENV
+          echo "CPPFLAGS=-I${HOMEBREW_PREFIX}/opt/libomp/include" >> $GITHUB_ENV
       - name: Build packages
         run: |
           # Build core (platform independent)
-          if [[ "${{ matrix.os }}" == ubuntu-* ]]; then
-            cd packages/leann-core
-            uv build
-            cd ../..
-          fi
+          cd packages/leann-core
+          uv build
+          cd ../..
           # Build HNSW backend
           cd packages/leann-backend-hnsw
-          if [ "${{ matrix.os }}" == "macos-latest" ]; then
-            # Use system clang instead of homebrew LLVM for better compatibility
+          if [[ "${{ matrix.os }}" == macos-* ]]; then
+            # Use system clang for better compatibility
             export CC=clang
             export CXX=clang++
             export MACOSX_DEPLOYMENT_TARGET=11.0
-            uv build --wheel --python python
+            uv build --wheel --python ${{ matrix.python }} --find-links ${GITHUB_WORKSPACE}/packages/leann-core/dist
           else
-            uv build --wheel --python python
+            uv build --wheel --python ${{ matrix.python }} --find-links ${GITHUB_WORKSPACE}/packages/leann-core/dist
           fi
           cd ../..
           # Build DiskANN backend
           cd packages/leann-backend-diskann
-          if [ "${{ matrix.os }}" == "macos-latest" ]; then
-            # Use system clang instead of homebrew LLVM for better compatibility
+          if [[ "${{ matrix.os }}" == macos-* ]]; then
+            # Use system clang for better compatibility
             export CC=clang
             export CXX=clang++
             # DiskANN requires macOS 13.3+ for sgesdd_ LAPACK function
             export MACOSX_DEPLOYMENT_TARGET=13.3
-            uv build --wheel --python python
+            uv build --wheel --python ${{ matrix.python }} --find-links ${GITHUB_WORKSPACE}/packages/leann-core/dist
           else
-            uv build --wheel --python python
+            uv build --wheel --python ${{ matrix.python }} --find-links ${GITHUB_WORKSPACE}/packages/leann-core/dist
           fi
           cd ../..
           # Build meta package (platform independent)
-          if [[ "${{ matrix.os }}" == ubuntu-* ]]; then
-            cd packages/leann
-            uv build
-            cd ../..
-          fi
+          cd packages/leann
+          uv build
+          cd ../..
       - name: Repair wheels (Linux)
         if: runner.os == 'Linux'
@@ -199,20 +220,18 @@ jobs:
           echo "📦 Built packages:"
           find packages/*/dist -name "*.whl" -o -name "*.tar.gz" | sort
       - name: Install built packages for testing
         run: |
-          # Create a virtual environment
-          uv venv
+          # Create a virtual environment with the correct Python version
+          uv venv --python ${{ matrix.python }}
           source .venv/bin/activate || source .venv/Scripts/activate
-          # Install the built wheels
-          # Use --find-links to let uv choose the correct wheel for the platform
-          if [[ "${{ matrix.os }}" == ubuntu-* ]]; then
-            uv pip install leann-core --find-links packages/leann-core/dist
-            uv pip install leann --find-links packages/leann/dist
-          fi
-          uv pip install leann-backend-hnsw --find-links packages/leann-backend-hnsw/dist
-          uv pip install leann-backend-diskann --find-links packages/leann-backend-diskann/dist
+          # Install packages using --find-links to prioritize local builds
+          uv pip install --find-links packages/leann-core/dist --find-links packages/leann-backend-hnsw/dist --find-links packages/leann-backend-diskann/dist packages/leann-core/dist/*.whl || uv pip install --find-links packages/leann-core/dist packages/leann-core/dist/*.tar.gz
+          uv pip install --find-links packages/leann-core/dist packages/leann-backend-hnsw/dist/*.whl
+          uv pip install --find-links packages/leann-core/dist packages/leann-backend-diskann/dist/*.whl
+          uv pip install packages/leann/dist/*.whl || uv pip install packages/leann/dist/*.tar.gz
           # Install test dependencies using extras
           uv pip install -e ".[test]"

@@ -3,10 +3,11 @@
 </p>
 <p align="center">
-  <img src="https://img.shields.io/badge/Python-3.9%2B-blue.svg" alt="Python 3.9+">
+  <img src="https://img.shields.io/badge/Python-3.9%20%7C%203.10%20%7C%203.11%20%7C%203.12%20%7C%203.13-blue.svg" alt="Python Versions">
+  <img src="https://github.com/yichuan-w/LEANN/actions/workflows/build-and-publish.yml/badge.svg" alt="CI Status">
+  <img src="https://img.shields.io/badge/Platform-Ubuntu%20%7C%20macOS%20(ARM64%2FIntel)-lightgrey" alt="Platform">
   <img src="https://img.shields.io/badge/License-MIT-green.svg" alt="MIT License">
-  <img src="https://img.shields.io/badge/Platform-Linux%20%7C%20macOS-lightgrey" alt="Platform">
-  <img src="https://img.shields.io/badge/MCP-Native%20Integration-blue?style=flat-square" alt="MCP Integration">
+  <img src="https://img.shields.io/badge/MCP-Native%20Integration-blue" alt="MCP Integration">
 </p>
 <h2 align="center" tabindex="-1" class="heading-element" dir="auto">

@@ -1,8 +0,0 @@
-# packages/leann-backend-diskann/CMakeLists.txt (simplified version)
-cmake_minimum_required(VERSION 3.20)
-project(leann_backend_diskann_wrapper)
-# Tell CMake to directly enter the DiskANN submodule and execute its own CMakeLists.txt
-# DiskANN will handle everything itself, including compiling Python bindings
-add_subdirectory(src/third_party/DiskANN)

@@ -4,7 +4,7 @@ import os
 import struct
 import sys
 from pathlib import Path
-from typing import Any, Literal
+from typing import Any, Literal, Optional
 import numpy as np
 import psutil
@@ -259,7 +259,7 @@ class DiskannSearcher(BaseSearcher):
         prune_ratio: float = 0.0,
         recompute_embeddings: bool = False,
         pruning_strategy: Literal["global", "local", "proportional"] = "global",
-        zmq_port: int | None = None,
+        zmq_port: Optional[int] = None,
         batch_recompute: bool = False,
         dedup_node_dis: bool = False,
         **kwargs,

@@ -10,6 +10,7 @@ import sys
 import threading
 import time
 from pathlib import Path
+from typing import Optional
 import numpy as np
 import zmq
@@ -32,7 +33,7 @@ if not logger.handlers:
 def create_diskann_embedding_server(
-    passages_file: str | None = None,
+    passages_file: Optional[str] = None,
     zmq_port: int = 5555,
     model_name: str = "sentence-transformers/all-mpnet-base-v2",
     embedding_mode: str = "sentence-transformers",

@@ -4,8 +4,8 @@ build-backend = "scikit_build_core.build"
 [project]
 name = "leann-backend-diskann"
-version = "0.2.6"
-dependencies = ["leann-core==0.2.6", "numpy", "protobuf>=3.19.0"]
+version = "0.2.7"
+dependencies = ["leann-core==0.2.7", "numpy", "protobuf>=3.19.0"]
 [tool.scikit-build]
 # Key: simplified CMake path
@@ -17,3 +17,5 @@ editable.mode = "redirect"
 cmake.build-type = "Release"
 build.verbose = true
 build.tool-args = ["-j8"]
+# Let CMake find packages via Homebrew prefix
+cmake.define = {CMAKE_PREFIX_PATH = {env = "CMAKE_PREFIX_PATH"}, OpenMP_ROOT = {env = "OpenMP_ROOT"}}

@@ -5,11 +5,20 @@ set(CMAKE_CXX_COMPILER_WORKS 1)
 # Set OpenMP path for macOS
 if(APPLE)
-    set(OpenMP_C_FLAGS "-Xpreprocessor -fopenmp -I/opt/homebrew/opt/libomp/include")
-    set(OpenMP_CXX_FLAGS "-Xpreprocessor -fopenmp -I/opt/homebrew/opt/libomp/include")
+    # Detect Homebrew installation path (Apple Silicon vs Intel)
+    if(EXISTS "/opt/homebrew/opt/libomp")
+        set(HOMEBREW_PREFIX "/opt/homebrew")
+    elseif(EXISTS "/usr/local/opt/libomp")
+        set(HOMEBREW_PREFIX "/usr/local")
+    else()
+        message(FATAL_ERROR "Could not find libomp installation. Please install with: brew install libomp")
+    endif()
+    set(OpenMP_C_FLAGS "-Xpreprocessor -fopenmp -I${HOMEBREW_PREFIX}/opt/libomp/include")
+    set(OpenMP_CXX_FLAGS "-Xpreprocessor -fopenmp -I${HOMEBREW_PREFIX}/opt/libomp/include")
     set(OpenMP_C_LIB_NAMES "omp")
     set(OpenMP_CXX_LIB_NAMES "omp")
-    set(OpenMP_omp_LIBRARY "/opt/homebrew/opt/libomp/lib/libomp.dylib")
+    set(OpenMP_omp_LIBRARY "${HOMEBREW_PREFIX}/opt/libomp/lib/libomp.dylib")
     # Force use of system libc++ to avoid version mismatch
     set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -stdlib=libc++")

@@ -2,7 +2,7 @@ import logging
 import os
 import shutil
 from pathlib import Path
-from typing import Any, Literal
+from typing import Any, Literal, Optional
 import numpy as np
 from leann.interface import (
@@ -152,7 +152,7 @@ class HNSWSearcher(BaseSearcher):
         self,
         query: np.ndarray,
         top_k: int,
-        zmq_port: int | None = None,
+        zmq_port: Optional[int] = None,
         complexity: int = 64,
         beam_width: int = 1,
         prune_ratio: float = 0.0,

@@ -10,6 +10,7 @@ import sys
 import threading
 import time
 from pathlib import Path
+from typing import Union
 import msgpack
 import numpy as np
@@ -33,7 +34,7 @@ if not logger.handlers:
 def create_hnsw_embedding_server(
-    passages_file: str | None = None,
+    passages_file: Union[str, None] = None,
     zmq_port: int = 5555,
     model_name: str = "sentence-transformers/all-mpnet-base-v2",
     distance_metric: str = "mips",

@@ -6,10 +6,10 @@ build-backend = "scikit_build_core.build"
 [project]
 name = "leann-backend-hnsw"
-version = "0.2.6"
+version = "0.2.7"
 description = "Custom-built HNSW (Faiss) backend for the Leann toolkit."
 dependencies = [
-    "leann-core==0.2.6",
+    "leann-core==0.2.7",
     "numpy",
     "pyzmq>=23.0.0",
     "msgpack>=1.0.0",
@@ -22,6 +22,8 @@ cmake.build-type = "Release"
 build.verbose = true
 build.tool-args = ["-j8"]
-# CMake definitions to optimize compilation
+# CMake definitions to optimize compilation and find Homebrew packages
 [tool.scikit-build.cmake.define]
 CMAKE_BUILD_PARALLEL_LEVEL = "8"
+CMAKE_PREFIX_PATH = {env = "CMAKE_PREFIX_PATH"}
+OpenMP_ROOT = {env = "OpenMP_ROOT"}

@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project] [project]
name = "leann-core" name = "leann-core"
version = "0.2.6" version = "0.2.7"
description = "Core API and plugin system for LEANN" description = "Core API and plugin system for LEANN"
readme = "README.md" readme = "README.md"
requires-python = ">=3.9" requires-python = ">=3.9"
@@ -31,8 +31,10 @@ dependencies = [
"PyPDF2>=3.0.0", "PyPDF2>=3.0.0",
"pymupdf>=1.23.0", "pymupdf>=1.23.0",
"pdfplumber>=0.10.0", "pdfplumber>=0.10.0",
"mlx>=0.26.3; sys_platform == 'darwin'", "nbconvert>=7.0.0", # For .ipynb file support
"mlx-lm>=0.26.0; sys_platform == 'darwin'", "gitignore-parser>=0.1.12", # For proper .gitignore handling
"mlx>=0.26.3; sys_platform == 'darwin' and platform_machine == 'arm64'",
"mlx-lm>=0.26.0; sys_platform == 'darwin' and platform_machine == 'arm64'",
] ]
[project.optional-dependencies] [project.optional-dependencies]
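Note on the marker change above: the added `platform_machine == 'arm64'` clause limits the MLX dependencies to Apple Silicon Macs, so Intel Macs no longer try to install them. A minimal sketch of how the two clauses combine is below. The helper `mlx_marker_applies` and the environment table are hypothetical and only mirror the marker's boolean logic with plain strings; installers such as pip or uv evaluate the real marker themselves.

```python
import platform
import sys


def mlx_marker_applies(sys_platform: str, platform_machine: str) -> bool:
    """Mirror of: sys_platform == 'darwin' and platform_machine == 'arm64'."""
    return sys_platform == "darwin" and platform_machine == "arm64"


# Hypothetical environments, written as (sys_platform, platform_machine).
environments = {
    "Apple Silicon Mac": ("darwin", "arm64"),
    "Intel Mac": ("darwin", "x86_64"),
    "Linux x86_64": ("linux", "x86_64"),
}

# Only the Apple Silicon entry satisfies both clauses.
results = {name: mlx_marker_applies(*env) for name, env in environments.items()}

# The same check against the interpreter that runs this snippet.
current = mlx_marker_applies(sys.platform, platform.machine())
```

Under the old marker (`sys_platform == 'darwin'` alone) the "Intel Mac" entry would also have matched, which is the case the commit messages describe as failing.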

@@ -10,7 +10,7 @@ import time
import warnings import warnings
from dataclasses import dataclass, field from dataclasses import dataclass, field
from pathlib import Path from pathlib import Path
from typing import Any, Literal from typing import Any, Literal, Optional
import numpy as np import numpy as np
@@ -33,7 +33,7 @@ def compute_embeddings(
model_name: str, model_name: str,
mode: str = "sentence-transformers", mode: str = "sentence-transformers",
use_server: bool = True, use_server: bool = True,
port: int | None = None, port: Optional[int] = None,
is_build=False, is_build=False,
) -> np.ndarray: ) -> np.ndarray:
""" """
@@ -157,12 +157,12 @@ class LeannBuilder:
self, self,
backend_name: str, backend_name: str,
embedding_model: str = "facebook/contriever", embedding_model: str = "facebook/contriever",
dimensions: int | None = None, dimensions: Optional[int] = None,
embedding_mode: str = "sentence-transformers", embedding_mode: str = "sentence-transformers",
**backend_kwargs, **backend_kwargs,
): ):
self.backend_name = backend_name self.backend_name = backend_name
backend_factory: LeannBackendFactoryInterface | None = BACKEND_REGISTRY.get(backend_name) backend_factory: Optional[LeannBackendFactoryInterface] = BACKEND_REGISTRY.get(backend_name)
if backend_factory is None: if backend_factory is None:
raise ValueError(f"Backend '{backend_name}' not found or not registered.") raise ValueError(f"Backend '{backend_name}' not found or not registered.")
self.backend_factory = backend_factory self.backend_factory = backend_factory
@@ -242,7 +242,7 @@ class LeannBuilder:
self.backend_kwargs = backend_kwargs self.backend_kwargs = backend_kwargs
self.chunks: list[dict[str, Any]] = [] self.chunks: list[dict[str, Any]] = []
def add_text(self, text: str, metadata: dict[str, Any] | None = None): def add_text(self, text: str, metadata: Optional[dict[str, Any]] = None):
if metadata is None: if metadata is None:
metadata = {} metadata = {}
passage_id = metadata.get("id", str(len(self.chunks))) passage_id = metadata.get("id", str(len(self.chunks)))
@@ -554,7 +554,7 @@ class LeannSearcher:
if "labels" in results and "distances" in results: if "labels" in results and "distances" in results:
logger.info(f" Processing {len(results['labels'][0])} passage IDs:") logger.info(f" Processing {len(results['labels'][0])} passage IDs:")
for i, (string_id, dist) in enumerate( for i, (string_id, dist) in enumerate(
zip(results["labels"][0], results["distances"][0], strict=False) zip(results["labels"][0], results["distances"][0])
): ):
try: try:
passage_data = self.passage_manager.get_passage(string_id) passage_data = self.passage_manager.get_passage(string_id)
@@ -592,7 +592,7 @@ class LeannChat:
def __init__( def __init__(
self, self,
index_path: str, index_path: str,
llm_config: dict[str, Any] | None = None, llm_config: Optional[dict[str, Any]] = None,
enable_warmup: bool = False, enable_warmup: bool = False,
**kwargs, **kwargs,
): ):
@@ -608,7 +608,7 @@ class LeannChat:
prune_ratio: float = 0.0, prune_ratio: float = 0.0,
recompute_embeddings: bool = True, recompute_embeddings: bool = True,
pruning_strategy: Literal["global", "local", "proportional"] = "global", pruning_strategy: Literal["global", "local", "proportional"] = "global",
llm_kwargs: dict[str, Any] | None = None, llm_kwargs: Optional[dict[str, Any]] = None,
expected_zmq_port: int = 5557, expected_zmq_port: int = 5557,
**search_kwargs, **search_kwargs,
): ):
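Note on the typing changes in this file: `requires-python` is `>=3.9`, and both the `X | None` union syntax evaluated at runtime and the `strict=` keyword of `zip` arrived in Python 3.10. The sketch below, which uses a hypothetical function `compute` rather than anything from LEANN, shows that the replacement spellings behave the same way on older interpreters.

```python
from typing import Optional, Union, get_type_hints


def compute(port: Optional[int] = None) -> Optional[int]:
    """Hypothetical stand-in for a function with an optional port argument."""
    return port


# Optional[int] and Union[int, None] are the same type object semantics,
# so the two spellings used across this diff are interchangeable.
same_type = Optional[int] == Union[int, None]
hints = get_type_hints(compute)

# zip() without strict= stops at the shortest input, which is also what
# strict=False does on Python 3.10+, so dropping the keyword keeps behavior.
labels = ["a", "b", "c"]
distances = [0.1, 0.2]
pairs = list(zip(labels, distances))
```

Because `strict=False` is the default, removing the keyword should not change how mismatched `labels` and `distances` lengths are handled; the extra label is silently dropped either way.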

@@ -8,7 +8,7 @@ import difflib
import logging import logging
import os import os
from abc import ABC, abstractmethod from abc import ABC, abstractmethod
from typing import Any from typing import Any, Optional
import torch import torch
@@ -311,7 +311,7 @@ def search_hf_models(query: str, limit: int = 10) -> list[str]:
def validate_model_and_suggest( def validate_model_and_suggest(
model_name: str, llm_type: str, host: str = "http://localhost:11434" model_name: str, llm_type: str, host: str = "http://localhost:11434"
) -> str | None: ) -> Optional[str]:
"""Validate model name and provide suggestions if invalid""" """Validate model name and provide suggestions if invalid"""
if llm_type == "ollama": if llm_type == "ollama":
available_models = check_ollama_models(host) available_models = check_ollama_models(host)
@@ -685,7 +685,7 @@ class HFChat(LLMInterface):
class OpenAIChat(LLMInterface): class OpenAIChat(LLMInterface):
"""LLM interface for OpenAI models.""" """LLM interface for OpenAI models."""
def __init__(self, model: str = "gpt-4o", api_key: str | None = None): def __init__(self, model: str = "gpt-4o", api_key: Optional[str] = None):
self.model = model self.model = model
self.api_key = api_key or os.getenv("OPENAI_API_KEY") self.api_key = api_key or os.getenv("OPENAI_API_KEY")
@@ -761,7 +761,7 @@ class SimulatedChat(LLMInterface):
return "This is a simulated answer from the LLM based on the retrieved context." return "This is a simulated answer from the LLM based on the retrieved context."
def get_llm(llm_config: dict[str, Any] | None = None) -> LLMInterface: def get_llm(llm_config: Optional[dict[str, Any]] = None) -> LLMInterface:
""" """
Factory function to get an LLM interface based on configuration. Factory function to get an LLM interface based on configuration.

@@ -1,6 +1,7 @@
import argparse import argparse
import asyncio import asyncio
from pathlib import Path from pathlib import Path
from typing import Union
from llama_index.core import SimpleDirectoryReader from llama_index.core import SimpleDirectoryReader
from llama_index.core.node_parser import SentenceSplitter from llama_index.core.node_parser import SentenceSplitter
@@ -203,62 +204,36 @@ Examples:
with open(global_registry, "w") as f: with open(global_registry, "w") as f:
json.dump(projects, f, indent=2) json.dump(projects, f, indent=2)
def _read_gitignore_patterns(self, docs_dir: str) -> list[str]: def _build_gitignore_parser(self, docs_dir: str):
"""Read .gitignore file and return patterns for exclusion.""" """Build gitignore parser using gitignore-parser library."""
gitignore_path = Path(docs_dir) / ".gitignore" from gitignore_parser import parse_gitignore
patterns = []
# Add some essential patterns that should always be excluded # Try to parse the root .gitignore
essential_patterns = [ gitignore_path = Path(docs_dir) / ".gitignore"
".git",
".DS_Store",
]
patterns.extend(essential_patterns)
if gitignore_path.exists(): if gitignore_path.exists():
try: try:
with open(gitignore_path, encoding="utf-8") as f: # Rules in the root .gitignore also apply to paths in subdirectories
for line in f: matches = parse_gitignore(str(gitignore_path))
line = line.strip() print(f"📋 Loaded .gitignore from {docs_dir} (includes all subdirectories)")
# Skip empty lines and comments return matches
if line and not line.startswith("#"):
# Remove leading slash if present (make it relative)
if line.startswith("/"):
line = line[1:]
patterns.append(line)
print(
f"📋 Loaded {len(patterns) - len(essential_patterns)} patterns from .gitignore"
)
except Exception as e: except Exception as e:
print(f"Warning: Could not read .gitignore: {e}") print(f"Warning: Could not parse .gitignore: {e}")
else: else:
print("📋 No .gitignore found, using minimal exclusion patterns") print("📋 No .gitignore found")
return patterns # Fallback: basic pattern matching for essential files
essential_patterns = {".git", ".DS_Store", "__pycache__", "node_modules", ".venv", "venv"}
def _should_exclude_file(self, relative_path: Path, exclude_patterns: list[str]) -> bool: def basic_matches(file_path):
"""Check if a file should be excluded based on gitignore-style patterns.""" path_parts = Path(file_path).parts
path_str = str(relative_path) return any(part in essential_patterns for part in path_parts)
for pattern in exclude_patterns: return basic_matches
# Simple pattern matching (could be enhanced with full gitignore syntax)
if pattern.endswith("*"):
# Wildcard pattern
prefix = pattern[:-1]
if path_str.startswith(prefix):
return True
elif "*" in pattern:
# Contains wildcard - simple glob-like matching
import fnmatch
if fnmatch.fnmatch(path_str, pattern): def _should_exclude_file(self, relative_path: Path, gitignore_matches) -> bool:
return True """Check if a file should be excluded using gitignore parser."""
else: return gitignore_matches(str(relative_path))
# Exact match or directory match
if path_str == pattern or path_str.startswith(pattern + "/"):
return True
return False
def list_indexes(self): def list_indexes(self):
print("Stored LEANN indexes:") print("Stored LEANN indexes:")
@@ -336,13 +311,13 @@ Examples:
print(f' leann search {example_name} "your query"') print(f' leann search {example_name} "your query"')
print(f" leann ask {example_name} --interactive") print(f" leann ask {example_name} --interactive")
def load_documents(self, docs_dir: str, custom_file_types: str | None = None): def load_documents(self, docs_dir: str, custom_file_types: Union[str, None] = None):
print(f"Loading documents from {docs_dir}...") print(f"Loading documents from {docs_dir}...")
if custom_file_types: if custom_file_types:
print(f"Using custom file types: {custom_file_types}") print(f"Using custom file types: {custom_file_types}")
# Read .gitignore patterns first # Build gitignore parser
exclude_patterns = self._read_gitignore_patterns(docs_dir) gitignore_matches = self._build_gitignore_parser(docs_dir)
# Try to use better PDF parsers first, but only if PDFs are requested # Try to use better PDF parsers first, but only if PDFs are requested
documents = [] documents = []
@@ -355,7 +330,7 @@ Examples:
for file_path in docs_path.rglob("*.pdf"): for file_path in docs_path.rglob("*.pdf"):
# Check if file matches any exclude pattern # Check if file matches any exclude pattern
relative_path = file_path.relative_to(docs_path) relative_path = file_path.relative_to(docs_path)
if self._should_exclude_file(relative_path, exclude_patterns): if self._should_exclude_file(relative_path, gitignore_matches):
continue continue
print(f"Processing PDF: {file_path}") print(f"Processing PDF: {file_path}")
@@ -449,14 +424,34 @@ Examples:
] ]
# Try to load other file types, but don't fail if none are found # Try to load other file types, but don't fail if none are found
try: try:
# Create a custom file filter function using the gitignore matcher
def file_filter(file_path: str) -> bool:
"""Return True if file should be included (not excluded)"""
try:
docs_path_obj = Path(docs_dir)
file_path_obj = Path(file_path)
relative_path = file_path_obj.relative_to(docs_path_obj)
return not self._should_exclude_file(relative_path, gitignore_matches)
except (ValueError, OSError):
return True # Include files that can't be processed
other_docs = SimpleDirectoryReader( other_docs = SimpleDirectoryReader(
docs_dir, docs_dir,
recursive=True, recursive=True,
encoding="utf-8", encoding="utf-8",
required_exts=code_extensions, required_exts=code_extensions,
exclude=exclude_patterns, file_extractor={}, # Use default extractors
filename_as_id=True,
).load_data(show_progress=True) ).load_data(show_progress=True)
documents.extend(other_docs)
# Filter documents after loading based on gitignore rules
filtered_docs = []
for doc in other_docs:
file_path = doc.metadata.get("file_path", "")
if file_filter(file_path):
filtered_docs.append(doc)
documents.extend(filtered_docs)
except ValueError as e: except ValueError as e:
if "No files found" in str(e): if "No files found" in str(e):
print("No additional files found for other supported types.") print("No additional files found for other supported types.")
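Note on the loading change above: the `exclude=` argument is dropped and documents are instead filtered after `load_data` returns, using the `file_path` entry in each document's metadata. A minimal sketch of that post-load filter is below. `FakeDoc`, `filter_docs`, and the sample paths are hypothetical stand-ins, not LlamaIndex or LEANN APIs; only the `metadata["file_path"]` lookup and the "include on `ValueError`" behavior come from the diff.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable


@dataclass
class FakeDoc:
    """Hypothetical stand-in for a loaded document with a metadata dict."""
    metadata: dict = field(default_factory=dict)


def filter_docs(docs: list, docs_dir: str, is_excluded: Callable[[str], bool]) -> list:
    """Keep documents whose path, relative to docs_dir, is not excluded."""
    root = Path(docs_dir)
    kept = []
    for doc in docs:
        file_path = doc.metadata.get("file_path", "")
        try:
            relative = Path(file_path).relative_to(root)
        except ValueError:
            # Mirrors the diff: paths that cannot be made relative are included.
            kept.append(doc)
            continue
        if not is_excluded(str(relative)):
            kept.append(doc)
    return kept


docs = [
    FakeDoc({"file_path": "/repo/src/a.py"}),
    FakeDoc({"file_path": "/repo/node_modules/x.js"}),
    FakeDoc({"file_path": "/elsewhere/b.py"}),
    FakeDoc({}),
]
kept = filter_docs(docs, "/repo", lambda rel: "node_modules" in Path(rel).parts)
kept_paths = [d.metadata.get("file_path", "") for d in kept]
```

One trade-off of filtering after loading is that excluded files are still read and parsed before being discarded, so large ignored directories may slow indexing even though their contents never reach the index.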

@@ -6,6 +6,7 @@ import subprocess
import sys import sys
import time import time
from pathlib import Path from pathlib import Path
from typing import Optional
import psutil import psutil
@@ -182,8 +183,8 @@ class EmbeddingServerManager:
e.g., "leann_backend_diskann.embedding_server" e.g., "leann_backend_diskann.embedding_server"
""" """
self.backend_module_name = backend_module_name self.backend_module_name = backend_module_name
self.server_process: subprocess.Popen | None = None self.server_process: Optional[subprocess.Popen] = None
self.server_port: int | None = None self.server_port: Optional[int] = None
self._atexit_registered = False self._atexit_registered = False
def start_server( def start_server(

@@ -1,5 +1,5 @@
from abc import ABC, abstractmethod from abc import ABC, abstractmethod
from typing import Any, Literal from typing import Any, Literal, Union
import numpy as np import numpy as np
@@ -34,7 +34,9 @@ class LeannBackendSearcherInterface(ABC):
pass pass
@abstractmethod @abstractmethod
def _ensure_server_running(self, passages_source_file: str, port: int | None, **kwargs) -> int: def _ensure_server_running(
self, passages_source_file: str, port: Union[int, None], **kwargs
) -> int:
"""Ensure server is running""" """Ensure server is running"""
pass pass
@@ -48,7 +50,7 @@ class LeannBackendSearcherInterface(ABC):
prune_ratio: float = 0.0, prune_ratio: float = 0.0,
recompute_embeddings: bool = False, recompute_embeddings: bool = False,
pruning_strategy: Literal["global", "local", "proportional"] = "global", pruning_strategy: Literal["global", "local", "proportional"] = "global",
zmq_port: int | None = None, zmq_port: Union[int, None] = None,
**kwargs, **kwargs,
) -> dict[str, Any]: ) -> dict[str, Any]:
"""Search for nearest neighbors """Search for nearest neighbors
@@ -74,7 +76,7 @@ class LeannBackendSearcherInterface(ABC):
self, self,
query: str, query: str,
use_server_if_available: bool = True, use_server_if_available: bool = True,
zmq_port: int | None = None, zmq_port: Union[int, None] = None,
) -> np.ndarray: ) -> np.ndarray:
"""Compute embedding for a query string """Compute embedding for a query string

@@ -1,7 +1,7 @@
import json import json
from abc import ABC, abstractmethod from abc import ABC, abstractmethod
from pathlib import Path from pathlib import Path
from typing import Any, Literal from typing import Any, Literal, Optional
import numpy as np import numpy as np
@@ -169,7 +169,7 @@ class BaseSearcher(LeannBackendSearcherInterface, ABC):
prune_ratio: float = 0.0, prune_ratio: float = 0.0,
recompute_embeddings: bool = False, recompute_embeddings: bool = False,
pruning_strategy: Literal["global", "local", "proportional"] = "global", pruning_strategy: Literal["global", "local", "proportional"] = "global",
zmq_port: int | None = None, zmq_port: Optional[int] = None,
**kwargs, **kwargs,
) -> dict[str, Any]: ) -> dict[str, Any]:
""" """

@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project] [project]
name = "leann" name = "leann"
version = "0.2.6" version = "0.2.7"
description = "LEANN - The smallest vector index in the world. RAG Everything with LEANN!" description = "LEANN - The smallest vector index in the world. RAG Everything with LEANN!"
readme = "README.md" readme = "README.md"
requires-python = ">=3.9" requires-python = ">=3.9"

@@ -32,7 +32,7 @@ dependencies = [
"pypdfium2>=4.30.0", "pypdfium2>=4.30.0",
# LlamaIndex core and readers - updated versions # LlamaIndex core and readers - updated versions
"llama-index>=0.12.44", "llama-index>=0.12.44",
"llama-index-readers-file>=0.4.0", # Essential for PDF parsing "llama-index-readers-file>=0.4.0", # Essential for PDF parsing
# "llama-index-readers-docling", # Requires Python >= 3.10 # "llama-index-readers-docling", # Requires Python >= 3.10
# "llama-index-node-parser-docling", # Requires Python >= 3.10 # "llama-index-node-parser-docling", # Requires Python >= 3.10
"llama-index-vector-stores-faiss>=0.4.0", "llama-index-vector-stores-faiss>=0.4.0",
@@ -40,9 +40,12 @@ dependencies = [
# Other dependencies # Other dependencies
"ipykernel==6.29.5", "ipykernel==6.29.5",
"msgpack>=1.1.1", "msgpack>=1.1.1",
"mlx>=0.26.3; sys_platform == 'darwin'", "mlx>=0.26.3; sys_platform == 'darwin' and platform_machine == 'arm64'",
"mlx-lm>=0.26.0; sys_platform == 'darwin'", "mlx-lm>=0.26.0; sys_platform == 'darwin' and platform_machine == 'arm64'",
"psutil>=5.8.0", "psutil>=5.8.0",
"pathspec>=0.12.1",
"nbconvert>=7.16.6",
"gitignore-parser>=0.1.12",
] ]
[project.optional-dependencies] [project.optional-dependencies]
@@ -88,7 +91,7 @@ leann-backend-diskann = { path = "packages/leann-backend-diskann", editable = tr
leann-backend-hnsw = { path = "packages/leann-backend-hnsw", editable = true } leann-backend-hnsw = { path = "packages/leann-backend-hnsw", editable = true }
[tool.ruff] [tool.ruff]
target-version = "py310" target-version = "py39"
line-length = 100 line-length = 100
extend-exclude = [ extend-exclude = [
"third_party", "third_party",

7318  uv.lock (generated)

File diff suppressed because it is too large