Compare commits

22 Commits

Author SHA1 Message Date
Andy Lee
971653fa1a upgrade: switch from manylinux2014 to manylinux_2_35
- Use manylinux_2_35 (GLIBC 2.35) instead of manylinux2014 (GLIBC 2.17)
- Still compatible with Google Colab (requires ≤2.35)
- Benefits: newer toolchain, better performance, modern C++ features
- Switch from yum to dnf package manager
- Remove pyzmq version cap as manylinux_2_35 has newer ZeroMQ
- Update documentation to reflect the change
2025-07-25 13:40:07 -07:00
Andy Lee
02672c040d update: DiskANN submodule to support both build environments
- Point to fix/python-finding-compatibility branch
- Support both Development and Development.Module for Python finding
- Fixes compatibility with both standard and manylinux builds
2025-07-25 13:33:36 -07:00
Andy Lee
f55108feda fix: resolve CI conflicts and Python finding issues
- Remove duplicate ci-cibuildwheel.yml workflow to avoid confusion
- Fix DiskANN CMakeLists.txt to support both standard and manylinux builds
- Try Development.Module first (manylinux), fallback to Development (standard)
- Keep build-reusable.yml and build-cibuildwheel.yml as separate workflows
- They build different wheel types and are selected at release time
2025-07-25 13:21:22 -07:00
Andy Lee
74d485c908 fix: remove Chinese comments from code
- Replace all Chinese comments with English
- Ensure code is internationalization-friendly
2025-07-25 12:55:28 -07:00
Andy Lee
d1fefb6378 fix: correct DiskANN subdirectory path
- Change from src/third_party/DiskANN to third_party/DiskANN
- This was causing CMake to fail finding the subdirectory
2025-07-25 12:51:47 -07:00
Andy Lee
732384f4f8 fix: improve Python finding for DiskANN in manylinux environment
- Add explicit Python finding logic to DiskANN CMakeLists.txt
- Change Development to Development.Module to avoid Embed requirement
- Pass Python variables through CMake cache to submodules
2025-07-25 12:22:27 -07:00
Andy Lee
ae38e10d1b chore: focus on Linux builds for Colab compatibility
- Remove macOS from build matrix to simplify CI
- Keep macOS configurations for future reference but they won't be used
- This speeds up CI and focuses on the primary target (Colab/Linux)
2025-07-25 12:20:32 -07:00
Andy Lee
ca0fd88934 chore: clean up temporary test scripts 2025-07-25 11:53:23 -07:00
Andy Lee
3c8d32f156 fix: prevent pyzmq compilation during tests
- Cap pyzmq version to <27 for manylinux2014 compatibility
- Pre-install pyzmq binary wheel before tests using CIBW_BEFORE_TEST
- Force pip to use only binary wheels with --only-binary :all:
2025-07-25 11:35:31 -07:00
Andy Lee
b8ff00fc6a fix: address macOS deployment target and pyzmq compilation issues
- Set MACOSX_DEPLOYMENT_TARGET=11.0 for macOS builds
- Add pyzmq to test requirements to use pre-built wheels
- Configure deployment target in both workflow and pyproject.toml
- Skip ARM64 tests on GitHub Actions to avoid cross-compilation issues
2025-07-25 11:13:51 -07:00
Andy Lee
3c836766f8 fix: update faiss submodule to fix-python-finding-manylinux branch 2025-07-25 10:55:28 -07:00
Andy Lee
b4a1dfb9c7 fix: update faiss submodule pointer with Python finding fixes 2025-07-25 10:53:05 -07:00
Andy Lee
a4d66e95d8 fix: improve Python finding for Faiss in manylinux environment
- Use Development.Module instead of Development component
- Find Python in main Faiss CMakeLists.txt before python subdirectory
- Add debug output to trace Python variable passing
- Set Python_FIND_VIRTUALENV=ONLY for Faiss
2025-07-25 10:52:34 -07:00
Andy Lee
cf58b3e31b fix: re-enable Faiss Python bindings and improve Python finding
- Re-enable FAISS_ENABLE_PYTHON since we need the Python bindings
- Use Development.Module component for better compatibility
- Pass Python information to Faiss through CMake cache variables
- Add CMAKE_PREFIX_PATH= to help CMake find Python in manylinux
2025-07-25 10:45:07 -07:00
Andy Lee
e9c2ca7936 fix: remove cmake.verbose from scikit-build config
- cmake.verbose is not allowed when minimum-version is set to 0.10 or higher
- This was causing build failures in cibuildwheel
2025-07-25 10:37:06 -07:00
Andy Lee
dab154a77b fix: disable Faiss Python bindings to avoid CMake Python finding issues
- Set FAISS_ENABLE_PYTHON to OFF since we use our own Cython bindings
- This avoids the CMake Python finding issues in manylinux environments
- Simplify CMakeLists.txt by removing unnecessary Python finding logic
- Keep swig installation for other potential uses
2025-07-25 10:33:31 -07:00
Andy Lee
13413dfae5 fix: improve Python detection in manylinux environment
- Modify faiss CMakeLists.txt to try both FindPython and FindPython3
- Add scikit-build configuration to help with Python detection
- Simplify Linux environment variables in cibuildwheel
- Add CMake Python detection before faiss configuration
2025-07-25 10:27:04 -07:00
Andy Lee
0543cc9816 fix: add missing dependencies for CI builds
- Add libomp for macOS to fix OpenMP linking error
- Add python3-devel for Linux (though may not be needed in manylinux)
- Install numpy before building to satisfy faiss CMake requirements
- Add before-build configuration to cibuildwheel
2025-07-25 10:20:02 -07:00
Andy Lee
fb53ed9a0e fix: use manylinux2014 for Colab compatibility
- Switch from manylinux_2_28 to manylinux2014 (provides manylinux_2_17)
- This should produce wheels compatible with manylinux_2_35_x86_64 requirement
- Update package manager from dnf to yum for CentOS 7
- Use cmake3 with symlink for compatibility
2025-07-25 10:15:05 -07:00
Andy Lee
015f43733a fix: adjust configuration for Colab compatibility
- Remove Windows support as not needed
- Move Python finding hints to cibuildwheel environment variables
- Keep pyproject.toml clean to avoid breaking normal builds
- Target manylinux_2_28 for better Colab compatibility
2025-07-25 10:09:21 -07:00
Andy Lee
2957c8bf5a docs: add manylinux build strategy documentation 2025-07-25 10:03:57 -07:00
Andy Lee
a73194c3f6 feat: simplify build using cibuildwheel with standard configuration
- Add Python_FIND_VIRTUALENV hints to pyproject.toml for CMake
- Create standardized cibuildwheel workflow using manylinux_2_28
- Simplify system dependency installation
- Add global cibuildwheel configuration in root pyproject.toml
- Create streamlined test workflow for manylinux compatibility
2025-07-25 10:03:23 -07:00
17 changed files with 623 additions and 105 deletions

171
.github/workflows/build-cibuildwheel.yml vendored Normal file

@@ -0,0 +1,171 @@
name: Build with cibuildwheel
on:
workflow_call:
inputs:
ref:
description: 'Git ref to build'
required: false
type: string
default: ''
jobs:
build_wheels:
name: Build wheels on ${{ matrix.os }}
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
matrix:
os: [ubuntu-latest] # Focus on Linux/manylinux for Colab compatibility
steps:
- uses: actions/checkout@v4
with:
ref: ${{ inputs.ref }}
submodules: recursive
- name: Setup Python
uses: actions/setup-python@v5
with:
python-version: '3.11'
# Build pure Python packages separately
- name: Build pure Python packages (leann-core, leann)
if: matrix.os == 'ubuntu-latest' # Only build once
run: |
python -m pip install --upgrade pip build
python -m build packages/leann-core --outdir wheelhouse/
python -m build packages/leann --outdir wheelhouse/
- name: Build leann-backend-hnsw wheels
uses: pypa/cibuildwheel@v2.20.0
with:
package-dir: packages/leann-backend-hnsw
output-dir: wheelhouse
env:
CIBW_BUILD: cp39-* cp310-* cp311-* cp312-* cp313-*
CIBW_SKIP: "*-win32 *-manylinux_i686 pp* *musllinux*"
# Use manylinux_2_35 for Colab compatibility with modern features
CIBW_MANYLINUX_X86_64_IMAGE: manylinux_2_35
CIBW_MANYLINUX_AARCH64_IMAGE: manylinux_2_35
# Linux dependencies - using dnf for manylinux_2_35 (based on AlmaLinux 9)
CIBW_BEFORE_ALL_LINUX: |
dnf install -y epel-release
dnf install -y gcc-c++ boost-devel zeromq-devel openblas-devel cmake python3-devel
# Install numpy before building
CIBW_BEFORE_BUILD: |
pip install numpy
pip install --upgrade pip setuptools wheel
CIBW_BEFORE_BUILD_LINUX: |
pip install numpy
pip install --upgrade pip setuptools wheel swig
CIBW_BEFORE_ALL_MACOS: |
brew install boost zeromq openblas cmake libomp
# Pre-install test dependencies to avoid compilation
CIBW_BEFORE_TEST: |
pip install --only-binary :all: "pyzmq>=23.0.0"
# Test command to verify the wheel works
CIBW_TEST_COMMAND: |
python -c "import leann_backend_hnsw; print('HNSW backend imported successfully')"
# Skip problematic configurations
CIBW_TEST_SKIP: "*-macosx_arm64" # Skip ARM64 tests on GitHub Actions
# Test dependencies
CIBW_TEST_REQUIRES: "pytest numpy"
# Environment variables for build
CIBW_ENVIRONMENT: |
CMAKE_BUILD_PARALLEL_LEVEL=8
Python_FIND_VIRTUALENV=ONLY
Python3_FIND_VIRTUALENV=ONLY
# Linux-specific environment variables
CIBW_ENVIRONMENT_LINUX: |
CMAKE_BUILD_PARALLEL_LEVEL=8
# macOS-specific environment variables
CIBW_ENVIRONMENT_MACOS: |
CMAKE_BUILD_PARALLEL_LEVEL=8
MACOSX_DEPLOYMENT_TARGET=11.0
CMAKE_OSX_DEPLOYMENT_TARGET=11.0
Python_FIND_VIRTUALENV=ONLY
Python3_FIND_VIRTUALENV=ONLY
- name: Build leann-backend-diskann wheels
uses: pypa/cibuildwheel@v2.20.0
with:
package-dir: packages/leann-backend-diskann
output-dir: wheelhouse
env:
CIBW_BUILD: cp39-* cp310-* cp311-* cp312-* cp313-*
CIBW_SKIP: "*-win32 *-manylinux_i686 pp* *musllinux*"
CIBW_MANYLINUX_X86_64_IMAGE: manylinux_2_35
CIBW_MANYLINUX_AARCH64_IMAGE: manylinux_2_35
CIBW_BEFORE_ALL_LINUX: |
dnf install -y epel-release
dnf install -y gcc-c++ boost-devel zeromq-devel openblas-devel cmake python3-devel
# Install numpy before building
CIBW_BEFORE_BUILD: |
pip install numpy
pip install --upgrade pip setuptools wheel
CIBW_BEFORE_BUILD_LINUX: |
pip install numpy
pip install --upgrade pip setuptools wheel swig
CIBW_BEFORE_ALL_MACOS: |
brew install boost zeromq openblas cmake libomp
# Pre-install test dependencies to avoid compilation
CIBW_BEFORE_TEST: |
pip install --only-binary :all: "pyzmq>=23.0.0"
# Test command to verify the wheel works
CIBW_TEST_COMMAND: |
python -c "import leann_backend_diskann; print('DiskANN backend imported successfully')"
# Skip problematic configurations
CIBW_TEST_SKIP: "*-macosx_arm64" # Skip ARM64 tests on GitHub Actions
# Test dependencies (pyzmq is pre-installed as a binary wheel in CIBW_BEFORE_TEST)
CIBW_TEST_REQUIRES: "pytest numpy"
CIBW_ENVIRONMENT: |
CMAKE_BUILD_PARALLEL_LEVEL=8
Python_FIND_VIRTUALENV=ONLY
Python3_FIND_VIRTUALENV=ONLY
# Linux-specific environment variables
CIBW_ENVIRONMENT_LINUX: |
CMAKE_BUILD_PARALLEL_LEVEL=8
CMAKE_PREFIX_PATH=$VIRTUAL_ENV
Python_FIND_VIRTUALENV=ONLY
Python3_FIND_VIRTUALENV=ONLY
Python_FIND_STRATEGY=LOCATION
Python3_FIND_STRATEGY=LOCATION
Python_EXECUTABLE=$VIRTUAL_ENV/bin/python
Python3_EXECUTABLE=$VIRTUAL_ENV/bin/python
# macOS-specific environment variables
CIBW_ENVIRONMENT_MACOS: |
CMAKE_BUILD_PARALLEL_LEVEL=8
MACOSX_DEPLOYMENT_TARGET=11.0
CMAKE_OSX_DEPLOYMENT_TARGET=11.0
Python_FIND_VIRTUALENV=ONLY
Python3_FIND_VIRTUALENV=ONLY
- uses: actions/upload-artifact@v4
with:
name: wheels-${{ matrix.os }}
path: ./wheelhouse/*.whl
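
The effect of the `CIBW_BUILD` and `CIBW_SKIP` settings in this workflow can be sketched in Python. cibuildwheel documents both options as space-separated, shell-style glob patterns matched against build identifiers. The `select_builds` helper and the sample identifier list are hypothetical illustrations, not cibuildwheel's own code.

```python
from fnmatch import fnmatchcase


def select_builds(identifiers, build, skip):
    """Keep identifiers that match a build pattern and no skip pattern."""
    build_patterns = build.split()
    skip_patterns = skip.split()
    selected = []
    for ident in identifiers:
        wanted = any(fnmatchcase(ident, p) for p in build_patterns)
        skipped = any(fnmatchcase(ident, p) for p in skip_patterns)
        if wanted and not skipped:
            selected.append(ident)
    return selected


# Values copied from the workflow above.
CIBW_BUILD = "cp39-* cp310-* cp311-* cp312-* cp313-*"
CIBW_SKIP = "*-win32 *-manylinux_i686 pp* *musllinux*"

# Hypothetical sample of build identifiers, for illustration only.
candidates = [
    "cp38-manylinux_x86_64",
    "cp311-manylinux_x86_64",
    "cp311-manylinux_i686",
    "cp312-musllinux_x86_64",
    "pp310-manylinux_x86_64",
    "cp313-manylinux_aarch64",
]

print(select_builds(candidates, CIBW_BUILD, CIBW_SKIP))
```

Under these assumptions only the CPython 3.9 to 3.13 manylinux identifiers for 64-bit targets survive; 32-bit, PyPy, and musllinux builds are filtered out.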

@@ -13,46 +13,107 @@ jobs:
build:
name: Build ${{ matrix.os }} Python ${{ matrix.python }}
strategy:
fail-fast: false
matrix:
include:
- os: ubuntu-22.04
- os: ubuntu-latest
python: '3.9'
- os: ubuntu-22.04
container: 'quay.io/pypa/manylinux2014_x86_64'
- os: ubuntu-latest
python: '3.10'
- os: ubuntu-22.04
container: 'quay.io/pypa/manylinux2014_x86_64'
- os: ubuntu-latest
python: '3.11'
- os: ubuntu-22.04
container: 'quay.io/pypa/manylinux2014_x86_64'
- os: ubuntu-latest
python: '3.12'
- os: ubuntu-22.04
container: 'quay.io/pypa/manylinux2014_x86_64'
- os: ubuntu-latest
python: '3.13'
container: 'quay.io/pypa/manylinux2014_x86_64'
- os: macos-latest
python: '3.9'
container: ''
- os: macos-latest
python: '3.10'
container: ''
- os: macos-latest
python: '3.11'
container: ''
- os: macos-latest
python: '3.12'
container: ''
- os: macos-latest
python: '3.13'
container: ''
runs-on: ${{ matrix.os }}
container: ${{ matrix.container }}
steps:
# For manylinux2014 compatibility, we'll handle checkout differently
- uses: actions/checkout@v4
if: matrix.container == ''
with:
ref: ${{ inputs.ref }}
submodules: recursive
- name: Setup Python
# Manual checkout for containers to avoid Node.js compatibility issues
- name: Manual checkout in container
if: matrix.container != ''
run: |
# Install git if not available
yum install -y git || true
# Configure git to handle the directory ownership issue
git config --global --add safe.directory ${GITHUB_WORKSPACE}
git config --global --add safe.directory /__w/LEANN/LEANN
git config --global --add safe.directory /github/workspace
git config --global --add safe.directory $(pwd)
# Clone the repository manually in the container
git init
git remote add origin https://github.com/${GITHUB_REPOSITORY}.git
# Fetch the appropriate ref
if [ -n "${{ inputs.ref }}" ]; then
git fetch --depth=1 origin ${{ inputs.ref }}
else
git fetch --depth=1 origin ${GITHUB_SHA}
fi
git checkout FETCH_HEAD
# Initialize submodules
git submodule update --init --recursive
- name: Setup Python (macOS and regular Ubuntu)
if: matrix.container == ''
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python }}
- name: Install uv
- name: Setup Python (manylinux container)
if: matrix.container != ''
run: |
# Use the pre-installed Python version in manylinux container
# Convert Python version format (3.9 -> 39, 3.10 -> 310, etc.)
PY_VER=$(echo "${{ matrix.python }}" | sed 's/\.//g')
/opt/python/cp${PY_VER}-*/bin/python -m pip install --upgrade pip
# Create symlinks for convenience
ln -sf /opt/python/cp${PY_VER}-*/bin/python /usr/local/bin/python
ln -sf /opt/python/cp${PY_VER}-*/bin/pip /usr/local/bin/pip
- name: Install uv (macOS and regular Ubuntu)
if: matrix.container == ''
uses: astral-sh/setup-uv@v4
- name: Install system dependencies (Ubuntu)
if: runner.os == 'Linux'
- name: Install uv (manylinux container)
if: matrix.container != ''
run: |
curl -LsSf https://astral.sh/uv/install.sh | sh
echo "$HOME/.cargo/bin" >> $GITHUB_PATH
- name: Install system dependencies (Ubuntu - regular)
if: runner.os == 'Linux' && matrix.container == ''
run: |
sudo apt-get update
sudo apt-get install -y libomp-dev libboost-all-dev protobuf-compiler libzmq3-dev \
@@ -65,6 +126,64 @@ jobs:
echo "MKLROOT=/opt/intel/oneapi/mkl/latest" >> $GITHUB_ENV
echo "LD_LIBRARY_PATH=/opt/intel/oneapi/mkl/latest/lib/intel64:$LD_LIBRARY_PATH" >> $GITHUB_ENV
- name: Install system dependencies (manylinux container)
if: runner.os == 'Linux' && matrix.container != ''
run: |
# manylinux2014 uses yum instead of apt
# Update yum cache first
yum clean all
yum makecache
# Install EPEL repository
yum install -y epel-release || true
# Update cache again after EPEL
yum makecache || true
# Install development packages
# Note: Some packages might have different names in CentOS 7
yum install -y \
gcc-c++ \
boost-devel \
protobuf-compiler \
protobuf-devel \
zeromq-devel \
pkgconfig \
openblas-devel \
cmake || {
echo "Some packages failed to install, trying alternatives..."
# Try alternative package names
yum install -y libzmq3-devel || true
yum install -y libzmq-devel || true
}
# Install optional packages that might not be available
yum install -y libaio-devel || echo "libaio-devel not available, continuing..."
# Verify zmq installation and create pkg-config file if needed
if [ ! -f /usr/lib64/pkgconfig/libzmq.pc ] && [ ! -f /usr/lib/pkgconfig/libzmq.pc ]; then
echo "Creating libzmq.pc file..."
mkdir -p /usr/lib64/pkgconfig
cat > /usr/lib64/pkgconfig/libzmq.pc << 'EOF'
prefix=/usr
exec_prefix=${prefix}
libdir=${exec_prefix}/lib64
includedir=${prefix}/include
Name: libzmq
Description: ZeroMQ library
Version: 4.1.4
Libs: -L${libdir} -lzmq
Cflags: -I${includedir}
EOF
fi
# Update PKG_CONFIG_PATH
echo "PKG_CONFIG_PATH=/usr/lib64/pkgconfig:/usr/lib/pkgconfig:$PKG_CONFIG_PATH" >> $GITHUB_ENV
# Build tools are pre-installed in manylinux
# MKL is more complex in container, skip for now and use OpenBLAS
- name: Install system dependencies (macOS)
if: runner.os == 'macOS'
run: |
@@ -72,44 +191,65 @@ jobs:
- name: Install build dependencies
run: |
uv pip install --system scikit-build-core numpy swig Cython pybind11
if [[ "$RUNNER_OS" == "Linux" ]]; then
uv pip install --system auditwheel
if [[ -n "${{ matrix.container }}" ]]; then
# In manylinux container, use regular pip
pip install scikit-build-core numpy swig Cython pybind11 auditwheel
else
uv pip install --system delocate
# Regular environment, use uv
uv pip install --system scikit-build-core numpy swig Cython pybind11
if [[ "$RUNNER_OS" == "Linux" ]]; then
uv pip install --system auditwheel
else
uv pip install --system delocate
fi
fi
- name: Build packages
run: |
# Choose build command based on environment
if [[ -n "${{ matrix.container }}" ]]; then
BUILD_CMD="pip wheel . --no-deps -w dist"
else
BUILD_CMD="uv build --wheel --python python"
fi
# Build core (platform independent)
if [[ "${{ matrix.os }}" == ubuntu-* ]]; then
if [ "${{ matrix.os }}" == "ubuntu-latest" ]; then
cd packages/leann-core
uv build
if [[ -n "${{ matrix.container }}" ]]; then
pip wheel . --no-deps -w dist
else
uv build
fi
cd ../..
fi
# Build HNSW backend
cd packages/leann-backend-hnsw
if [ "${{ matrix.os }}" == "macos-latest" ]; then
CC=$(brew --prefix llvm)/bin/clang CXX=$(brew --prefix llvm)/bin/clang++ uv build --wheel --python python
CC=$(brew --prefix llvm)/bin/clang CXX=$(brew --prefix llvm)/bin/clang++ $BUILD_CMD
else
uv build --wheel --python python
eval $BUILD_CMD
fi
cd ../..
# Build DiskANN backend
cd packages/leann-backend-diskann
if [ "${{ matrix.os }}" == "macos-latest" ]; then
CC=$(brew --prefix llvm)/bin/clang CXX=$(brew --prefix llvm)/bin/clang++ uv build --wheel --python python
CC=$(brew --prefix llvm)/bin/clang CXX=$(brew --prefix llvm)/bin/clang++ $BUILD_CMD
else
uv build --wheel --python python
eval $BUILD_CMD
fi
cd ../..
# Build meta package (platform independent)
if [[ "${{ matrix.os }}" == ubuntu-* ]]; then
if [ "${{ matrix.os }}" == "ubuntu-latest" ]; then
cd packages/leann
uv build
if [[ -n "${{ matrix.container }}" ]]; then
pip wheel . --no-deps -w dist
else
uv build
fi
cd ../..
fi
@@ -119,6 +259,9 @@ jobs:
# Repair HNSW wheel
cd packages/leann-backend-hnsw
if [ -d dist ]; then
# Show what platform auditwheel will use
auditwheel show dist/*.whl || true
# Let auditwheel auto-detect the appropriate manylinux tag
auditwheel repair dist/*.whl -w dist_repaired
rm -rf dist
mv dist_repaired dist
@@ -128,6 +271,9 @@ jobs:
# Repair DiskANN wheel
cd packages/leann-backend-diskann
if [ -d dist ]; then
# Show what platform auditwheel will use
auditwheel show dist/*.whl || true
# Let auditwheel auto-detect the appropriate manylinux tag
auditwheel repair dist/*.whl -w dist_repaired
rm -rf dist
mv dist_repaired dist
@@ -163,5 +309,5 @@ jobs:
- name: Upload artifacts
uses: actions/upload-artifact@v4
with:
name: packages-${{ matrix.os }}-py${{ matrix.python }}
name: packages-${{ matrix.os }}-py${{ matrix.python }}${{ matrix.container && '-manylinux' || '' }}
path: packages/*/dist/
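
The "Setup Python (manylinux container)" step turns a matrix entry such as `3.10` into `310` with `sed` and then relies on the glob `/opt/python/cp310-*/bin/python`. A minimal Python sketch of the same mapping follows. The helper name `manylinux_python` is hypothetical, and the exact `cpXY-cpXY` directory naming is an assumption about the layout of the manylinux images, not something this diff states.

```python
def manylinux_python(version: str) -> str:
    """Map a matrix entry such as "3.10" to an interpreter path.

    Assumes the image stores interpreters under /opt/python/<tag>-<abi>/
    and that the ABI tag equals the interpreter tag (e.g. cp310-cp310).
    """
    major, minor = version.split(".")
    tag = f"cp{major}{minor}"
    return f"/opt/python/{tag}-{tag}/bin/python"


for v in ("3.9", "3.10", "3.13"):
    print(v, "->", manylinux_python(v))
```

Spelling out the full directory name avoids depending on the glob expanding to exactly one path. If an image also ships a second build for the same version (for example a free-threaded `cp313-cp313t` directory, which is an assumption here), `cp313-*` would match more than one interpreter and the `ln -sf` calls would no longer receive a single source path.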

@@ -7,6 +7,11 @@ on:
description: 'Version to release (e.g., 0.1.2)'
required: true
type: string
use_cibuildwheel:
description: 'Use cibuildwheel for better compatibility (recommended for Colab)'
required: false
type: boolean
default: false
jobs:
update-version:
@@ -31,38 +36,37 @@ jobs:
- name: Update versions and push
id: push
run: |
# Check current version
CURRENT_VERSION=$(grep "^version" packages/leann-core/pyproject.toml | cut -d'"' -f2)
echo "Current version: $CURRENT_VERSION"
echo "Target version: ${{ inputs.version }}"
if [ "$CURRENT_VERSION" = "${{ inputs.version }}" ]; then
echo "⚠️ Version is already ${{ inputs.version }}, skipping update"
COMMIT_SHA=$(git rev-parse HEAD)
else
./scripts/bump_version.sh ${{ inputs.version }}
git config user.name "GitHub Actions"
git config user.email "actions@github.com"
git add packages/*/pyproject.toml
git commit -m "chore: release v${{ inputs.version }}"
git push origin main
COMMIT_SHA=$(git rev-parse HEAD)
echo "✅ Pushed version update: $COMMIT_SHA"
fi
./scripts/bump_version.sh ${{ inputs.version }}
git config user.name "GitHub Actions"
git config user.email "actions@github.com"
git add packages/*/pyproject.toml
git commit -m "chore: release v${{ inputs.version }}"
git push origin main
COMMIT_SHA=$(git rev-parse HEAD)
echo "commit-sha=$COMMIT_SHA" >> $GITHUB_OUTPUT
echo "✅ Pushed version update: $COMMIT_SHA"
build-packages:
name: Build packages
build-packages-reusable:
name: Build packages (Standard)
needs: update-version
if: ${{ !inputs.use_cibuildwheel }}
uses: ./.github/workflows/build-reusable.yml
with:
ref: ${{ needs.update-version.outputs.commit-sha }}
build-packages-cibuildwheel:
name: Build packages (cibuildwheel)
needs: update-version
if: ${{ inputs.use_cibuildwheel }}
uses: ./.github/workflows/build-cibuildwheel.yml
with:
ref: ${{ needs.update-version.outputs.commit-sha }}
publish:
name: Publish and Release
needs: [update-version, build-packages]
if: always() && needs.update-version.result == 'success' && needs.build-packages.result == 'success'
needs: [update-version, build-packages-reusable, build-packages-cibuildwheel]
if: always() && needs.update-version.result == 'success' && (needs.build-packages-reusable.result == 'success' || needs.build-packages-cibuildwheel.result == 'success')
runs-on: ubuntu-latest
permissions:
contents: write
@@ -103,24 +107,12 @@ jobs:
- name: Create release
run: |
# Check if tag already exists
if git rev-parse "v${{ inputs.version }}" >/dev/null 2>&1; then
echo "⚠️ Tag v${{ inputs.version }} already exists, skipping tag creation"
else
git tag "v${{ inputs.version }}"
git push origin "v${{ inputs.version }}"
echo "✅ Created and pushed tag v${{ inputs.version }}"
fi
git tag "v${{ inputs.version }}"
git push origin "v${{ inputs.version }}"
# Check if release already exists
if gh release view "v${{ inputs.version }}" >/dev/null 2>&1; then
echo "⚠️ Release v${{ inputs.version }} already exists, skipping release creation"
else
gh release create "v${{ inputs.version }}" \
--title "Release v${{ inputs.version }}" \
--notes "🚀 Released to PyPI: https://pypi.org/project/leann/${{ inputs.version }}/" \
--latest
echo "✅ Created GitHub release v${{ inputs.version }}"
fi
gh release create "v${{ inputs.version }}" \
--title "Release v${{ inputs.version }}" \
--notes "🚀 Released to PyPI: https://pypi.org/project/leann/${{ inputs.version }}/" \
--latest
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
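
The new `publish` condition can be read as a small truth table: exactly one of the two build jobs runs per release, and the other is reported as skipped. The sketch below models that gate in Python. The function name `should_publish` is hypothetical, and the result strings mirror the values GitHub Actions exposes through `needs.<job>.result`.

```python
def should_publish(update_version: str, reusable: str, cibuildwheel: str) -> bool:
    """Mirror the `if:` expression on the publish job.

    `always()` is omitted here; in the workflow it keeps the job from being
    skipped automatically when one of the needed jobs was skipped.
    """
    return update_version == "success" and (
        reusable == "success" or cibuildwheel == "success"
    )


# use_cibuildwheel = false: the reusable build runs, cibuildwheel is skipped.
print(should_publish("success", "success", "skipped"))   # True
# use_cibuildwheel = true: the roles are swapped.
print(should_publish("success", "skipped", "success"))   # True
# The selected build failed: nothing is published.
print(should_publish("success", "skipped", "failure"))   # False
```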

60
.github/workflows/test-manylinux.yml vendored Normal file

@@ -0,0 +1,60 @@
name: Test Manylinux Build
on:
workflow_dispatch:
pull_request:
branches: [ main ]
paths:
- '.github/workflows/**'
- 'packages/**'
- 'pyproject.toml'
push:
branches:
- 'fix/manylinux-*'
- 'test/build-*'
jobs:
build:
uses: ./.github/workflows/build-cibuildwheel.yml
test-install:
needs: build
runs-on: ubuntu-22.04 # Simulating Colab environment
strategy:
matrix:
python-version: ['3.10', '3.11', '3.12']
steps:
- uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
- name: Download artifacts
uses: actions/download-artifact@v4
with:
pattern: wheels-*
path: dist
merge-multiple: true
- name: Test installation
run: |
python -m pip install --upgrade pip
# Find and install the appropriate wheels
pip install dist/leann_core-*.whl
pip install dist/leann_backend_hnsw-*manylinux*.whl
pip install dist/leann-*.whl
- name: Test import
run: |
python -c "
import leann
from leann import LeannBuilder, LeannSearcher
print('Successfully imported leann modules')
# Quick functionality test
builder = LeannBuilder(backend_name='hnsw')
builder.add_text('Test document')
print('LeannBuilder created and used successfully')
"

@@ -0,0 +1,50 @@
# Manylinux Build Strategy
## Problem
Google Colab requires wheels compatible with `manylinux_2_35_x86_64` or earlier. Our previous builds were producing `manylinux_2_39_x86_64` wheels, which are incompatible.
## Solution
We're using `cibuildwheel` with `manylinux_2_35` images to build wheels that are compatible with Google Colab while maintaining modern toolchain features.
### Key Changes
1. **cibuildwheel Configuration**
- Using `manylinux_2_35` images (GLIBC 2.35)
- Using the `dnf` package manager
- Installing `cmake` from the distribution packages
2. **Build Matrix**
- Python versions: 3.9, 3.10, 3.11, 3.12, 3.13
- Platforms: Linux (x86_64), macOS
- No Windows support (not required)
3. **Dependencies**
- Linux: gcc-c++, boost-devel, zeromq-devel, openblas-devel, cmake, python3-devel
- macOS: boost, zeromq, openblas, cmake (via Homebrew)
4. **Environment Variables**
- `CMAKE_BUILD_PARALLEL_LEVEL=8`: Speed up builds
- `Python_FIND_VIRTUALENV=ONLY`: Help CMake find Python in cibuildwheel env
- `Python3_FIND_VIRTUALENV=ONLY`: Alternative variable for compatibility
## Testing Strategy
1. **CI Pipeline**: `test-manylinux.yml`
- Triggers on PR to main, manual dispatch, or push to `fix/manylinux-*` branches
- Builds wheels using cibuildwheel
- Tests installation on Ubuntu 22.04 (simulating Colab)
2. **Local Testing**
```bash
# Download built wheels
# Test in fresh environment
python -m venv test_env
source test_env/bin/activate
pip install leann_core-*.whl leann_backend_hnsw-*manylinux*.whl leann-*.whl
python -c "from leann import LeannBuilder; print('Success!')"
```
## References
- [cibuildwheel documentation](https://cibuildwheel.readthedocs.io/)
- [manylinux standards](https://github.com/pypa/manylinux)
- [PEP 599 - manylinux2014](https://peps.python.org/pep-0599/)
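
The compatibility rule from the Problem section can be sketched in Python: a `manylinux_X_Y` wheel is installable when the target system's glibc is at least `X.Y`, and the legacy names are aliases for fixed glibc versions. The helper names below are hypothetical, and the Colab glibc version of 2.35 is taken from this document rather than measured.

```python
import re

# Legacy tag names and the glibc versions they stand for (PEP 600).
LEGACY_ALIASES = {
    "manylinux1": (2, 5),
    "manylinux2010": (2, 12),
    "manylinux2014": (2, 17),
}


def glibc_floor(platform_tag: str) -> tuple:
    """Return the minimum glibc (major, minor) a manylinux tag requires."""
    match = re.fullmatch(r"manylinux_(\d+)_(\d+)_(.+)", platform_tag)
    if match:
        return (int(match.group(1)), int(match.group(2)))
    for alias, version in LEGACY_ALIASES.items():
        if platform_tag.startswith(alias + "_"):
            return version
    raise ValueError(f"not a manylinux tag: {platform_tag}")


def installable(platform_tag: str, system_glibc: tuple) -> bool:
    """A wheel is installable when the system glibc meets the tag's floor."""
    return glibc_floor(platform_tag) <= system_glibc


COLAB_GLIBC = (2, 35)  # assumption: the value stated in this document

for tag in ("manylinux_2_39_x86_64", "manylinux_2_35_x86_64", "manylinux2014_x86_64"):
    print(tag, installable(tag, COLAB_GLIBC))
```

On a real machine the glibc version can be read with `platform.libc_ver()` from the standard library instead of being hard-coded.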

@@ -3,6 +3,34 @@
cmake_minimum_required(VERSION 3.20)
project(leann_backend_diskann_wrapper)
# Find Python - scikit-build-core should provide this
find_package(Python REQUIRED COMPONENTS Interpreter Development.Module)
# Print Python information for debugging
message(STATUS "Python_FOUND: ${Python_FOUND}")
message(STATUS "Python_VERSION: ${Python_VERSION}")
message(STATUS "Python_EXECUTABLE: ${Python_EXECUTABLE}")
message(STATUS "Python_INCLUDE_DIRS: ${Python_INCLUDE_DIRS}")
# Pass Python information to DiskANN through cache variables
set(Python_EXECUTABLE ${Python_EXECUTABLE} CACHE FILEPATH "Python executable" FORCE)
set(Python_INCLUDE_DIRS ${Python_INCLUDE_DIRS} CACHE PATH "Python include dirs" FORCE)
set(Python_LIBRARIES ${Python_LIBRARIES} CACHE FILEPATH "Python libraries" FORCE)
set(Python_VERSION ${Python_VERSION} CACHE STRING "Python version" FORCE)
set(Python_FOUND ${Python_FOUND} CACHE BOOL "Python found" FORCE)
# Also set Python3 variables for compatibility
set(Python3_EXECUTABLE ${Python_EXECUTABLE} CACHE FILEPATH "Python3 executable" FORCE)
set(Python3_INCLUDE_DIRS ${Python_INCLUDE_DIRS} CACHE PATH "Python3 include dirs" FORCE)
set(Python3_LIBRARIES ${Python_LIBRARIES} CACHE FILEPATH "Python3 libraries" FORCE)
set(Python3_VERSION ${Python_VERSION} CACHE STRING "Python3 version" FORCE)
set(Python3_FOUND ${Python_FOUND} CACHE BOOL "Python3 found" FORCE)
set(Python3_Development_FOUND TRUE CACHE BOOL "Python3 development found" FORCE)
# Set Python finding strategy
set(Python_FIND_VIRTUALENV ONLY CACHE STRING "" FORCE)
set(Python3_FIND_VIRTUALENV ONLY CACHE STRING "" FORCE)
# Tell CMake to directly enter the DiskANN submodule and execute its own CMakeLists.txt
# DiskANN will handle everything itself, including compiling Python bindings
add_subdirectory(src/third_party/DiskANN)
add_subdirectory(third_party/DiskANN)

@@ -4,8 +4,8 @@ build-backend = "scikit_build_core.build"
[project]
name = "leann-backend-diskann"
version = "0.1.10"
dependencies = ["leann-core==0.1.10", "numpy", "protobuf>=3.19.0"]
version = "0.1.8"
dependencies = ["leann-core==0.1.8", "numpy"]
[tool.scikit-build]
# Key: simplified CMake path
@@ -16,4 +16,9 @@ wheel.packages = ["leann_backend_diskann"]
editable.mode = "redirect"
cmake.build-type = "Release"
build.verbose = true
build.tool-args = ["-j8"]
build.tool-args = ["-j8"]
wheel.exclude = ["CMakeLists.txt", "src", "third_party/**", "*.o", "*.so"]
sdist.include = ["CMakeLists.txt", "src", "third_party", "leann_backend_diskann/*.txt"]
[tool.scikit-build.cmake.define]
CMAKE_BUILD_PARALLEL_LEVEL = "8"

@@ -24,6 +24,38 @@ set(MSGPACK_USE_BOOST OFF CACHE BOOL "" FORCE)
add_compile_definitions(MSGPACK_NO_BOOST)
include_directories(third_party/msgpack-c/include)
# Find Python for our own use (not for Faiss)
if(DEFINED SKBUILD)
message(STATUS "Building with scikit-build")
# scikit-build-core provides Python information
endif()
# Find Python - scikit-build-core should provide this
find_package(Python REQUIRED COMPONENTS Interpreter Development.Module NumPy)
# Print Python information for debugging
message(STATUS "Python_FOUND: ${Python_FOUND}")
message(STATUS "Python_VERSION: ${Python_VERSION}")
message(STATUS "Python_EXECUTABLE: ${Python_EXECUTABLE}")
message(STATUS "Python_INCLUDE_DIRS: ${Python_INCLUDE_DIRS}")
message(STATUS "Python_NumPy_INCLUDE_DIRS: ${Python_NumPy_INCLUDE_DIRS}")
# Pass Python information to faiss through cache variables
set(Python_EXECUTABLE ${Python_EXECUTABLE} CACHE FILEPATH "Python executable" FORCE)
set(Python_INCLUDE_DIRS ${Python_INCLUDE_DIRS} CACHE PATH "Python include dirs" FORCE)
set(Python_NumPy_INCLUDE_DIRS ${Python_NumPy_INCLUDE_DIRS} CACHE PATH "NumPy include dirs" FORCE)
set(Python_VERSION ${Python_VERSION} CACHE STRING "Python version" FORCE)
set(Python_FOUND ${Python_FOUND} CACHE BOOL "Python found" FORCE)
# Also set Python3 variables for compatibility
set(Python3_EXECUTABLE ${Python_EXECUTABLE} CACHE FILEPATH "Python3 executable" FORCE)
set(Python3_INCLUDE_DIRS ${Python_INCLUDE_DIRS} CACHE PATH "Python3 include dirs" FORCE)
set(Python3_NumPy_INCLUDE_DIRS ${Python_NumPy_INCLUDE_DIRS} CACHE PATH "NumPy include dirs" FORCE)
set(Python3_VERSION ${Python_VERSION} CACHE STRING "Python3 version" FORCE)
set(Python3_FOUND ${Python_FOUND} CACHE BOOL "Python3 found" FORCE)
set(Python3_Development_FOUND TRUE CACHE BOOL "Python3 development found" FORCE)
set(Python3_NumPy_FOUND TRUE CACHE BOOL "Python3 NumPy found" FORCE)
# Faiss configuration - streamlined build
set(FAISS_ENABLE_PYTHON ON CACHE BOOL "" FORCE)
set(FAISS_ENABLE_GPU OFF CACHE BOOL "" FORCE)
@@ -52,4 +84,8 @@ set(FAISS_BUILD_AVX512 OFF CACHE BOOL "" FORCE)
# IMPORTANT: Disable building AVX versions to speed up compilation
set(FAISS_BUILD_AVX_VERSIONS OFF CACHE BOOL "" FORCE)
# Force Faiss to use our Python settings
set(Python_FIND_VIRTUALENV ONLY CACHE STRING "" FORCE)
set(Python3_FIND_VIRTUALENV ONLY CACHE STRING "" FORCE)
add_subdirectory(third_party/faiss)

@@ -81,21 +81,7 @@ def create_hnsw_embedding_server(
with open(passages_file, "r") as f:
meta = json.load(f)
# Convert relative paths to absolute paths based on metadata file location
metadata_dir = Path(
passages_file
).parent.parent # Go up one level from the metadata file
passage_sources = []
for source in meta["passage_sources"]:
source_copy = source.copy()
# Convert relative paths to absolute paths
if not Path(source_copy["path"]).is_absolute():
source_copy["path"] = str(metadata_dir / source_copy["path"])
if not Path(source_copy["index_path"]).is_absolute():
source_copy["index_path"] = str(metadata_dir / source_copy["index_path"])
passage_sources.append(source_copy)
passages = PassageManager(passage_sources)
passages = PassageManager(meta["passage_sources"])
logger.info(
f"Loaded PassageManager with {len(passages.global_offset_map)} passages from metadata"
)
@@ -284,15 +270,15 @@ def create_hnsw_embedding_server(
if __name__ == "__main__":
import signal
import sys
def signal_handler(sig, frame):
logger.info(f"Received signal {sig}, shutting down gracefully...")
sys.exit(0)
# Register signal handlers for graceful shutdown
signal.signal(signal.SIGTERM, signal_handler)
signal.signal(signal.SIGINT, signal_handler)
parser = argparse.ArgumentParser(description="HNSW Embedding service")
parser.add_argument("--zmq-port", type=int, default=5555, help="ZMQ port to run on")
parser.add_argument(


@@ -6,22 +6,24 @@ build-backend = "scikit_build_core.build"
[project]
name = "leann-backend-hnsw"
version = "0.1.10"
version = "0.1.8"
description = "Custom-built HNSW (Faiss) backend for the Leann toolkit."
dependencies = [
"leann-core==0.1.10",
"leann-core==0.1.8",
"numpy",
"pyzmq>=23.0.0",
"msgpack>=1.0.0",
]
[tool.scikit-build]
wheel.packages = ["leann_backend_hnsw"]
editable.mode = "redirect"
cmake.build-type = "Release"
build.verbose = true
build.tool-args = ["-j8"]
wheel.exclude = ["CMakeLists.txt", "src", "third_party"]
sdist.include = ["CMakeLists.txt", "src", "third_party", "leann_backend_hnsw/*.txt"]
cmake.args = ["-DCMAKE_BUILD_TYPE=Release"]
# Ensure CMake can find system libraries
build-dir = "build/{cache_tag}"
minimum-version = "build-system.requires"
# CMake definitions to optimize compilation
[tool.scikit-build.cmake.define]
CMAKE_BUILD_PARALLEL_LEVEL = "8"
CMAKE_BUILD_PARALLEL_LEVEL = "8"
SKBUILD_SOABI = "YES"


@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project]
name = "leann-core"
version = "0.1.10"
version = "0.1.8"
description = "Core API and plugin system for LEANN"
readme = "README.md"
requires-python = ">=3.9"
@@ -15,7 +15,7 @@ dependencies = [
"numpy>=1.20.0",
"tqdm>=4.60.0",
"psutil>=5.8.0",
"pyzmq>=23.0.0",
"pyzmq>=23.0.0,<27", # Cap at 26.x for manylinux2014 compatibility
"msgpack>=1.0.0",
"torch>=2.0.0",
"sentence-transformers>=2.2.0",


@@ -269,9 +269,7 @@ class EmbeddingServerManager:
]
if kwargs.get("passages_file"):
# Convert to absolute path to ensure subprocess can find the file
passages_file = Path(kwargs["passages_file"]).resolve()
command.extend(["--passages-file", str(passages_file)])
command.extend(["--passages-file", str(kwargs["passages_file"])])
if embedding_mode != "sentence-transformers":
command.extend(["--embedding-mode", embedding_mode])


@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project]
name = "leann"
version = "0.1.10"
version = "0.1.8"
description = "LEANN - The smallest vector index in the world. RAG Everything with LEANN!"
readme = "README.md"
requires-python = ">=3.9"


@@ -60,3 +60,29 @@ py-modules = []
leann-core = { path = "packages/leann-core", editable = true }
leann-backend-diskann = { path = "packages/leann-backend-diskann", editable = true }
leann-backend-hnsw = { path = "packages/leann-backend-hnsw", editable = true }
[tool.cibuildwheel]
# Skip 32-bit and PyPy builds
skip = "*-win32 *-manylinux_i686 pp* *musllinux*"
# Use manylinux_2_35 for Colab compatibility while keeping modern features
manylinux-x86_64-image = "manylinux_2_35"
manylinux-aarch64-image = "manylinux_2_35"
# Linux system dependencies
[tool.cibuildwheel.linux]
before-all = """
yum install -y epel-release
yum install -y gcc-c++ boost-devel zeromq-devel openblas-devel cmake3 python3-devel
ln -sf /usr/bin/cmake3 /usr/bin/cmake
"""
# macOS system dependencies
[tool.cibuildwheel.macos]
before-all = "brew install boost zeromq openblas cmake libomp"
# Set minimum macOS version
environment = { MACOSX_DEPLOYMENT_TARGET = "11.0", CMAKE_OSX_DEPLOYMENT_TARGET = "11.0" }
# Environment variables configuration
[tool.cibuildwheel.environment]
CMAKE_BUILD_PARALLEL_LEVEL = "8"

uv.lock generated

@@ -1800,7 +1800,7 @@ wheels = [
[[package]]
name = "leann-backend-diskann"
version = "0.1.0"
version = "0.1.8"
source = { editable = "packages/leann-backend-diskann" }
dependencies = [
{ name = "leann-core" },
@@ -1810,39 +1810,57 @@ dependencies = [
[package.metadata]
requires-dist = [
{ name = "leann-core", specifier = "==0.1.0" },
{ name = "leann-core", specifier = "==0.1.8" },
{ name = "numpy" },
]
[[package]]
name = "leann-backend-hnsw"
version = "0.1.0"
version = "0.1.8"
source = { editable = "packages/leann-backend-hnsw" }
dependencies = [
{ name = "leann-core" },
{ name = "msgpack" },
{ name = "numpy", version = "2.2.6", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" },
{ name = "numpy", version = "2.3.1", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" },
{ name = "pyzmq" },
]
[package.metadata]
requires-dist = [
{ name = "leann-core", specifier = "==0.1.0" },
{ name = "leann-core", specifier = "==0.1.8" },
{ name = "msgpack", specifier = ">=1.0.0" },
{ name = "numpy" },
{ name = "pyzmq", specifier = ">=23.0.0" },
]
[[package]]
name = "leann-core"
version = "0.1.0"
version = "0.1.8"
source = { editable = "packages/leann-core" }
dependencies = [
{ name = "llama-index-core" },
{ name = "msgpack" },
{ name = "numpy", version = "2.2.6", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" },
{ name = "numpy", version = "2.3.1", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.11'" },
{ name = "psutil" },
{ name = "python-dotenv" },
{ name = "pyzmq" },
{ name = "sentence-transformers" },
{ name = "torch" },
{ name = "tqdm" },
]
[package.metadata]
requires-dist = [
{ name = "llama-index-core", specifier = ">=0.12.0" },
{ name = "msgpack", specifier = ">=1.0.0" },
{ name = "numpy", specifier = ">=1.20.0" },
{ name = "psutil", specifier = ">=5.8.0" },
{ name = "python-dotenv", specifier = ">=1.0.0" },
{ name = "pyzmq", specifier = ">=23.0.0" },
{ name = "sentence-transformers", specifier = ">=2.2.0" },
{ name = "torch", specifier = ">=2.0.0" },
{ name = "tqdm", specifier = ">=4.60.0" },
]