fix(hnsw-server): robust ZMQ responses to prevent size mismatch and segfault in CI

sky: expand leann-build.yaml with configurable params and flags (backend, recompute, compact, embedding options)
docs: dedupe recomputation guidance; keep single Low-resource setups section
11 changed files with 336 additions and 83 deletions
--- a/.github/workflows/build-reusable.yml
+++ b/.github/workflows/build-reusable.yml
@@ -64,6 +64,16 @@ jobs:
            python: '3.12'
          - os: macos-14
            python: '3.13'
+          - os: macos-15
+            python: '3.9'
+          - os: macos-15
+            python: '3.10'
+          - os: macos-15
+            python: '3.11'
+          - os: macos-15
+            python: '3.12'
+          - os: macos-15
+            python: '3.13'
          - os: macos-13
            python: '3.9'
          - os: macos-13
@@ -147,7 +157,14 @@ jobs:
            # Use system clang for better compatibility
            export CC=clang
            export CXX=clang++
-            export MACOSX_DEPLOYMENT_TARGET=11.0
+            # Homebrew libraries on each macOS version require matching minimum version
+            if [[ "${{ matrix.os }}" == "macos-13" ]]; then
+              export MACOSX_DEPLOYMENT_TARGET=13.0
+            elif [[ "${{ matrix.os }}" == "macos-14" ]]; then
+              export MACOSX_DEPLOYMENT_TARGET=14.0
+            elif [[ "${{ matrix.os }}" == "macos-15" ]]; then
+              export MACOSX_DEPLOYMENT_TARGET=15.0
+            fi
            uv build --wheel --python ${{ matrix.python }} --find-links ${GITHUB_WORKSPACE}/packages/leann-core/dist
          else
            uv build --wheel --python ${{ matrix.python }} --find-links ${GITHUB_WORKSPACE}/packages/leann-core/dist
@@ -161,7 +178,14 @@ jobs:
            export CC=clang
            export CXX=clang++
            # DiskANN requires macOS 13.3+ for sgesdd_ LAPACK function
-            export MACOSX_DEPLOYMENT_TARGET=13.3
+            # But Homebrew libraries on each macOS version require matching minimum version
+            if [[ "${{ matrix.os }}" == "macos-13" ]]; then
+              export MACOSX_DEPLOYMENT_TARGET=13.3
+            elif [[ "${{ matrix.os }}" == "macos-14" ]]; then
+              export MACOSX_DEPLOYMENT_TARGET=14.0
+            elif [[ "${{ matrix.os }}" == "macos-15" ]]; then
+              export MACOSX_DEPLOYMENT_TARGET=15.0
+            fi
            uv build --wheel --python ${{ matrix.python }} --find-links ${GITHUB_WORKSPACE}/packages/leann-core/dist
          else
            uv build --wheel --python ${{ matrix.python }} --find-links ${GITHUB_WORKSPACE}/packages/leann-core/dist
@@ -197,10 +221,24 @@ jobs:
      - name: Repair wheels (macOS)
        if: runner.os == 'macOS'
        run: |
+          # Determine deployment target based on runner OS
+          # Must match the Homebrew libraries for each macOS version
+          if [[ "${{ matrix.os }}" == "macos-13" ]]; then
+            HNSW_TARGET="13.0"
+            DISKANN_TARGET="13.3"
+          elif [[ "${{ matrix.os }}" == "macos-14" ]]; then
+            HNSW_TARGET="14.0"
+            DISKANN_TARGET="14.0"
+          elif [[ "${{ matrix.os }}" == "macos-15" ]]; then
+            HNSW_TARGET="15.0"
+            DISKANN_TARGET="15.0"
+          fi
+
          # Repair HNSW wheel
          cd packages/leann-backend-hnsw
          if [ -d dist ]; then
-            delocate-wheel -w dist_repaired -v dist/*.whl
+            export MACOSX_DEPLOYMENT_TARGET=$HNSW_TARGET
+            delocate-wheel -w dist_repaired -v --require-target-macos-version $HNSW_TARGET dist/*.whl
            rm -rf dist
            mv dist_repaired dist
          fi
@@ -209,7 +247,8 @@ jobs:
          # Repair DiskANN wheel
          cd packages/leann-backend-diskann
          if [ -d dist ]; then
-            delocate-wheel -w dist_repaired -v dist/*.whl
+            export MACOSX_DEPLOYMENT_TARGET=$DISKANN_TARGET
+            delocate-wheel -w dist_repaired -v --require-target-macos-version $DISKANN_TARGET dist/*.whl
            rm -rf dist
            mv dist_repaired dist
          fi
@@ -249,8 +288,8 @@ jobs:
          # Activate virtual environment
          source .venv/bin/activate || source .venv/Scripts/activate

-          # Run all tests
-          pytest tests/
+          # Run tests
+          pytest -v tests/

      - name: Run sanity checks (optional)
        run: |
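The three `if/elif` chains in this workflow encode the same runner-to-target table. A minimal Python sketch of that table, assuming a hypothetical helper name `deployment_target` (nothing like it exists in the repository), may make the mapping easier to audit. Unlike the shell version, which leaves the variable unset for an unknown runner, this sketch raises an error.

```python
# Hypothetical lookup table mirroring the values set in build-reusable.yml.
DEPLOYMENT_TARGETS = {
    "macos-13": {"hnsw": "13.0", "diskann": "13.3"},
    "macos-14": {"hnsw": "14.0", "diskann": "14.0"},
    "macos-15": {"hnsw": "15.0", "diskann": "15.0"},
}


def deployment_target(runner_os: str, backend: str) -> str:
    """Return the MACOSX_DEPLOYMENT_TARGET for a runner/backend pair."""
    try:
        return DEPLOYMENT_TARGETS[runner_os][backend]
    except KeyError as exc:
        # The workflow silently leaves the variable unset here; failing
        # loudly is a design choice of this sketch, not of the workflow.
        raise ValueError(
            f"no deployment target for {runner_os!r}/{backend!r}"
        ) from exc
```

Keeping the values in one table would also avoid the build step and the repair step drifting apart, since both currently repeat the same numbers.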
--- a/README.md
+++ b/README.md
@@ -71,6 +71,8 @@ source .venv/bin/activate
 uv pip install leann
 ```

+> Low-resource? See “Low-resource setups” in the [Configuration Guide](docs/configuration-guide.md#low-resource-setups).
+
 <details>
 <summary>
 <strong>🔧 Build from Source (Recommended for development)</strong>
--- a/docs/configuration-guide.md
+++ b/docs/configuration-guide.md
@@ -259,24 +259,80 @@ Every configuration choice involves trade-offs:

 The key is finding the right balance for your specific use case. Start small and simple, measure performance, then scale up only where needed.

-## Deep Dive: Critical Configuration Decisions
+## Low-resource setups

-### When to Disable Recomputation
+If you don’t have a local GPU or builds/searches are too slow, use one or more of the options below.

-LEANN's recomputation feature provides exact distance calculations but can be disabled for extreme QPS requirements:
+### 1) Use OpenAI embeddings (no local compute)
+
+Fastest path with zero local GPU requirements. Set your API key and use OpenAI embeddings during build and search:

 ```bash
--no-recompute  # Disable selective recomputation
+export OPENAI_API_KEY=sk-...
+
+# Build with OpenAI embeddings
+leann build my-index \
+  --embedding-mode openai \
+  --embedding-model text-embedding-3-small
+
+# Search with OpenAI embeddings (recompute at query time)
+leann search my-index "your query" \
+  --recompute-embeddings
 ```

-**Trade-offs**:
- **With recomputation** (default): Exact distances, best quality, higher latency, minimal storage (only stores metadata, recomputes embeddings on-demand)
- **Without recomputation**: Must store full embeddings, significantly higher memory and storage usage (10-100x more), but faster search
+### 2) Run remote builds with SkyPilot (cloud GPU)

-**Disable when**:
- You have abundant storage and memory
- Need extremely low latency (< 100ms)
- Running a read-heavy workload where storage cost is acceptable
+Offload embedding generation and index building to a GPU VM using SkyPilot. A template is provided at `sky/leann-build.yaml`.
+
+```bash
+# One-time: install and configure SkyPilot
+pip install skypilot
+sky launch -c leann-gpu sky/leann-build.yaml
+
+# Build remotely (template installs uv + leann CLI)
+sky exec leann-gpu -- "leann build my-index --docs ~/leann-data --backend hnsw --complexity 64 --graph-degree 32"
+```
+
+Details: see “Running Builds on SkyPilot (Optional)” below.
+
+### 3) Disable recomputation to trade storage for speed
+
+If you need lower latency and have more storage/memory, disable recomputation. This stores full embeddings and avoids recomputing at search time.
+
+```bash
+# Build without recomputation (HNSW requires non-compact in this mode)
+leann build my-index --no-recompute --no-compact
+
+# Search without recomputation
+leann search my-index "your query" --no-recompute
+```
+
+Trade-offs: lower query-time latency, but significantly higher storage usage.
+
+## Running Builds on SkyPilot (Optional)
+
+You can offload embedding generation and index building to a cloud GPU VM using SkyPilot, without changing any LEANN code. This is useful when your local machine lacks a GPU or you want faster throughput.
+
+### Quick Start
+
+1) Install SkyPilot by following their docs (`pip install skypilot`), then configure cloud credentials.
+
+2) Use the provided SkyPilot template:
+
+```bash
+sky launch -c leann-gpu sky/leann-build.yaml
+```
+
+3) On the remote, either put your data under the mounted path or adjust `file_mounts` in `sky/leann-build.yaml`. Then run the LEANN build:
+
+```bash
+sky exec leann-gpu -- "leann build my-index --docs ~/leann-data --backend hnsw --complexity 64 --graph-degree 32"
+```
+
+Notes:
+- The template installs `uv` and the `leann` CLI globally on the remote instance.
+- Change the `accelerators` and `cloud` settings in `sky/leann-build.yaml` to match your budget/availability (e.g., `A10G:1`, `A100:1`, or CPU-only if you prefer).
+- You can also build with the DiskANN backend by passing `--backend diskann`.

 ## Further Reading

--- a/packages/leann-backend-diskann/pyproject.toml
+++ b/packages/leann-backend-diskann/pyproject.toml
@@ -4,8 +4,8 @@ build-backend = "scikit_build_core.build"

 [project]
 name = "leann-backend-diskann"
-version = "0.2.8"
-dependencies = ["leann-core==0.2.8", "numpy", "protobuf>=3.19.0"]
+version = "0.2.9"
+dependencies = ["leann-core==0.2.9", "numpy", "protobuf>=3.19.0"]

 [tool.scikit-build]
 # Key: simplified CMake path
--- a/packages/leann-backend-hnsw/leann_backend_hnsw/hnsw_embedding_server.py
+++ b/packages/leann-backend-hnsw/leann_backend_hnsw/hnsw_embedding_server.py
@@ -95,6 +95,8 @@ def create_hnsw_embedding_server(
        passage_sources.append(source_copy)

    passages = PassageManager(passage_sources)
+    # Use index dimensions from metadata for shaping fallback responses
+    embedding_dim: int = int(meta.get("dimensions", 0))
    logger.info(
        f"Loaded PassageManager with {len(passages.global_offset_map)} passages from metadata"
    )
@@ -109,6 +111,9 @@ def create_hnsw_embedding_server(
        socket.setsockopt(zmq.RCVTIMEO, 300000)
        socket.setsockopt(zmq.SNDTIMEO, 300000)

+        # Track last request type for safe fallback responses on exceptions
+        last_request_type = "unknown"  # one of: 'text', 'distance', 'embedding', 'unknown'
+        last_request_length = 0
        while True:
            try:
                message_bytes = socket.recv()
@@ -121,6 +126,8 @@ def create_hnsw_embedding_server(
                if isinstance(request_payload, list) and len(request_payload) > 0:
                    # Check if this is a direct text request (list of strings)
                    if all(isinstance(item, str) for item in request_payload):
+                        last_request_type = "text"
+                        last_request_length = len(request_payload)
                        logger.info(
                            f"Processing direct text embedding request for {len(request_payload)} texts in {embedding_mode} mode"
                        )
@@ -145,43 +152,66 @@ def create_hnsw_embedding_server(
                ):
                    node_ids = request_payload[0]
                    query_vector = np.array(request_payload[1], dtype=np.float32)
+                    last_request_type = "distance"
+                    last_request_length = len(node_ids)

                    logger.debug("Distance calculation request received")
                    logger.debug(f"    Node IDs: {node_ids}")
                    logger.debug(f"    Query vector dim: {len(query_vector)}")

-                    # Get embeddings for node IDs
-                    texts = []
-                    for nid in node_ids:
+                    # Get embeddings for node IDs, tolerate missing IDs
+                    texts: list[str] = []
+                    found_indices: list[int] = []
+                    for idx, nid in enumerate(node_ids):
                        try:
                            passage_data = passages.get_passage(str(nid))
-                            txt = passage_data["text"]
-                            texts.append(txt)
+                            txt = passage_data.get("text", "")
+                            if isinstance(txt, str) and len(txt) > 0:
+                                texts.append(txt)
+                                found_indices.append(idx)
+                            else:
+                                logger.error(f"Empty text for passage ID {nid}")
                        except KeyError:
                            logger.error(f"Passage ID {nid} not found")
-                            raise RuntimeError(f"FATAL: Passage with ID {nid} not found")
                        except Exception as e:
                            logger.error(f"Exception looking up passage ID {nid}: {e}")
-                            raise

-                    # Process embeddings
-                    embeddings = compute_embeddings(texts, model_name, mode=embedding_mode)
-                    logger.info(
-                        f"Computed embeddings for {len(texts)} texts, shape: {embeddings.shape}"
+                    # Prepare full-length response distances with safe fallbacks
+                    large_distance = 1e9
+                    response_distances = [large_distance] * len(node_ids)
+
+                    if texts:
+                        try:
+                            # Process embeddings only for found indices
+                            embeddings = compute_embeddings(texts, model_name, mode=embedding_mode)
+                            logger.info(
+                                f"Computed embeddings for {len(texts)} texts, shape: {embeddings.shape}"
+                            )
+
+                            # Calculate distances for found embeddings only
+                            if distance_metric == "l2":
+                                partial_distances = np.sum(
+                                    np.square(embeddings - query_vector.reshape(1, -1)), axis=1
+                                )
+                            else:  # mips or cosine
+                                partial_distances = -np.dot(embeddings, query_vector)
+
+                            # Place computed distances back into the full response array
+                            for pos, dval in zip(
+                                found_indices, partial_distances.flatten().tolist()
+                            ):
+                                response_distances[pos] = float(dval)
+                        except Exception as e:
+                            logger.error(
+                                f"Distance computation error, falling back to large distances: {e}"
+                            )
+
+                    # Always reply with exactly len(node_ids) distances
+                    response_bytes = msgpack.packb([response_distances], use_single_float=True)
+                    logger.debug(
+                        f"Sending distance response with {len(response_distances)} distances (found={len(found_indices)})"
                    )

-                    # Calculate distances
-                    if distance_metric == "l2":
-                        distances = np.sum(
-                            np.square(embeddings - query_vector.reshape(1, -1)), axis=1
-                        )
-                    else:  # mips or cosine
-                        distances = -np.dot(embeddings, query_vector)
-
-                    response_payload = distances.flatten().tolist()
-                    response_bytes = msgpack.packb([response_payload], use_single_float=True)
-                    logger.debug(f"Sending distance response with {len(distances)} distances")
-
                    socket.send(response_bytes)
                    e2e_end = time.time()
                    logger.info(f"⏱️  Distance calculation E2E time: {e2e_end - e2e_start:.6f}s")
@@ -201,40 +231,61 @@ def create_hnsw_embedding_server(

                node_ids = request_payload[0]
                logger.debug(f"Request for {len(node_ids)} node embeddings")
+                last_request_type = "embedding"
+                last_request_length = len(node_ids)

-                # Look up texts by node IDs
-                texts = []
-                for nid in node_ids:
+                # Allocate output buffer (B, D) and fill with zeros for robustness
+                if embedding_dim <= 0:
+                    logger.error("Embedding dimension unknown; cannot serve embedding request")
+                    dims = [0, 0]
+                    data = []
+                else:
+                    dims = [len(node_ids), embedding_dim]
+                    data = [0.0] * (dims[0] * dims[1])
+
+                # Look up texts by node IDs; compute embeddings where available
+                texts: list[str] = []
+                found_indices: list[int] = []
+                for idx, nid in enumerate(node_ids):
                    try:
                        passage_data = passages.get_passage(str(nid))
-                        txt = passage_data["text"]
-                        if not txt:
-                            raise RuntimeError(f"FATAL: Empty text for passage ID {nid}")
-                        texts.append(txt)
+                        txt = passage_data.get("text", "")
+                        if isinstance(txt, str) and len(txt) > 0:
+                            texts.append(txt)
+                            found_indices.append(idx)
+                        else:
+                            logger.error(f"Empty text for passage ID {nid}")
                    except KeyError:
-                        raise RuntimeError(f"FATAL: Passage with ID {nid} not found")
+                        logger.error(f"Passage with ID {nid} not found")
                    except Exception as e:
                        logger.error(f"Exception looking up passage ID {nid}: {e}")
-                        raise

-                # Process embeddings
-                embeddings = compute_embeddings(texts, model_name, mode=embedding_mode)
-                logger.info(
-                    f"Computed embeddings for {len(texts)} texts, shape: {embeddings.shape}"
-                )
+                if texts:
+                    try:
+                        # Process embeddings for found texts only
+                        embeddings = compute_embeddings(texts, model_name, mode=embedding_mode)
+                        logger.info(
+                            f"Computed embeddings for {len(texts)} texts, shape: {embeddings.shape}"
+                        )

-                # Serialization and response
-                if np.isnan(embeddings).any() or np.isinf(embeddings).any():
-                    logger.error(
-                        f"NaN or Inf detected in embeddings! Requested IDs: {node_ids[:5]}..."
-                    )
-                    raise AssertionError()
+                        if np.isnan(embeddings).any() or np.isinf(embeddings).any():
+                            logger.error(
+                                f"NaN or Inf detected in embeddings! Requested IDs: {node_ids[:5]}..."
+                            )
+                            dims = [0, embedding_dim]
+                            data = []
+                        else:
+                            # Copy computed embeddings into the correct positions
+                            emb_f32 = np.ascontiguousarray(embeddings, dtype=np.float32)
+                            flat = emb_f32.flatten().tolist()
+                            for j, pos in enumerate(found_indices):
+                                start = pos * embedding_dim
+                                end = start + embedding_dim
+                                data[start:end] = flat[j * embedding_dim : (j + 1) * embedding_dim]
+                    except Exception as e:
+                        logger.error(f"Embedding computation error, returning zeros: {e}")

-                hidden_contiguous_f32 = np.ascontiguousarray(embeddings, dtype=np.float32)
-                response_payload = [
-                    list(hidden_contiguous_f32.shape),
-                    hidden_contiguous_f32.flatten().tolist(),
-                ]
+                response_payload = [dims, data]
                response_bytes = msgpack.packb(response_payload, use_single_float=True)

                socket.send(response_bytes)
@@ -249,7 +300,22 @@ def create_hnsw_embedding_server(
                import traceback

                traceback.print_exc()
-                socket.send(msgpack.packb([[], []]))
+                # Fallback to a safe, minimal-structure response to avoid client crashes
+                if last_request_type == "distance":
+                    # Return a vector of large distances with the expected length
+                    fallback_len = max(0, int(last_request_length))
+                    large_distance = 1e9
+                    safe_response = [[large_distance] * fallback_len]
+                elif last_request_type == "embedding":
+                    # Return an empty embedding block with known dimension if available
+                    if embedding_dim > 0:
+                        safe_response = [[0, embedding_dim], []]
+                    else:
+                        safe_response = [[0, 0], []]
+                else:
+                    # Unknown request type: default to empty embedding structure
+                    safe_response = [[0, int(embedding_dim) if embedding_dim > 0 else 0], []]
+                socket.send(msgpack.packb(safe_response, use_single_float=True))

    zmq_thread = threading.Thread(target=zmq_server_thread, daemon=True)
    zmq_thread.start()
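The core of this fix is that the distance reply always has exactly `len(node_ids)` entries: missing or failed lookups keep a large sentinel distance, and computed values are scattered back to their original positions. The following is a standard-library-only sketch of that pattern. The names `full_length_distances`, `lookup_text`, and `embed` are hypothetical stand-ins for the server's passage lookup and `compute_embeddings`, and only the L2 branch is shown.

```python
from typing import Callable, Optional, Sequence

LARGE_DISTANCE = 1e9  # same sentinel value the server uses for missing IDs


def l2_squared(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def full_length_distances(
    node_ids: Sequence[int],
    lookup_text: Callable[[int], Optional[str]],    # hypothetical lookup
    embed: Callable[[list[str]], list[list[float]]],  # hypothetical embedder
    query: Sequence[float],
) -> list[float]:
    """Return one distance per requested ID, using a sentinel for failures."""
    texts: list[str] = []
    found_indices: list[int] = []
    for idx, nid in enumerate(node_ids):
        txt = lookup_text(nid)
        if isinstance(txt, str) and txt:
            texts.append(txt)
            found_indices.append(idx)

    # Start from a full-length answer so the reply size never depends on
    # how many lookups succeeded.
    distances = [LARGE_DISTANCE] * len(node_ids)
    if texts:
        try:
            vectors = embed(texts)
            for pos, vec in zip(found_indices, vectors):
                distances[pos] = float(l2_squared(vec, query))
        except Exception:
            # Keep the sentinel values if embedding fails.
            pass
    return distances
```

With a toy store and embedder, a request for three IDs where the middle one is missing still yields three distances:

```python
store = {1: "aa", 3: "b"}
toy_embed = lambda texts: [[float(len(t)), 0.0] for t in texts]
result = full_length_distances([1, 2, 3], store.get, toy_embed, [0.0, 0.0])
# result has length 3; position 1 holds LARGE_DISTANCE
```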
--- a/packages/leann-backend-hnsw/pyproject.toml
+++ b/packages/leann-backend-hnsw/pyproject.toml
@@ -6,10 +6,10 @@ build-backend = "scikit_build_core.build"

 [project]
 name = "leann-backend-hnsw"
-version = "0.2.8"
+version = "0.2.9"
 description = "Custom-built HNSW (Faiss) backend for the Leann toolkit."
 dependencies = [
-    "leann-core==0.2.8",
+    "leann-core==0.2.9",
    "numpy",
    "pyzmq>=23.0.0",
    "msgpack>=1.0.0",
--- a/packages/leann-core/pyproject.toml
+++ b/packages/leann-core/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

 [project]
 name = "leann-core"
-version = "0.2.8"
+version = "0.2.9"
 description = "Core API and plugin system for LEANN"
 readme = "README.md"
 requires-python = ">=3.9"
--- a/packages/leann-core/src/leann/cli.py
+++ b/packages/leann-core/src/leann/cli.py
@@ -117,7 +117,19 @@ Examples:
        build_parser.add_argument("--complexity", type=int, default=64)
        build_parser.add_argument("--num-threads", type=int, default=1)
        build_parser.add_argument("--compact", action="store_true", default=True)
+        build_parser.add_argument(
+            "--no-compact",
+            dest="compact",
+            action="store_false",
+            help="Disable compact index storage (store full embeddings; higher storage)",
+        )
        build_parser.add_argument("--recompute", action="store_true", default=True)
+        build_parser.add_argument(
+            "--no-recompute",
+            dest="recompute",
+            action="store_false",
+            help="Disable embedding recomputation (store full embeddings; lower query latency)",
+        )
        build_parser.add_argument(
            "--file-types",
            type=str,
@@ -138,6 +150,18 @@ Examples:
            default=True,
            help="Recompute embeddings (default: True)",
        )
+        search_parser.add_argument(
+            "--no-recompute-embeddings",
+            dest="recompute_embeddings",
+            action="store_false",
+            help="Disable embedding recomputation during search",
+        )
+        search_parser.add_argument(
+            "--no-recompute",
+            dest="recompute_embeddings",
+            action="store_false",
+            help="Alias for --no-recompute-embeddings",
+        )
        search_parser.add_argument(
            "--pruning-strategy",
            choices=["global", "local", "proportional"],
@@ -166,6 +190,18 @@ Examples:
            default=True,
            help="Recompute embeddings (default: True)",
        )
+        ask_parser.add_argument(
+            "--no-recompute-embeddings",
+            dest="recompute_embeddings",
+            action="store_false",
+            help="Disable embedding recomputation during ask",
+        )
+        ask_parser.add_argument(
+            "--no-recompute",
+            dest="recompute_embeddings",
+            action="store_false",
+            help="Alias for --no-recompute-embeddings",
+        )
        ask_parser.add_argument(
            "--pruning-strategy",
            choices=["global", "local", "proportional"],
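The new flags all follow one `argparse` pattern: a positive `store_true` flag that defaults to `True`, plus one or more `store_false` flags that share its `dest`. A minimal standalone sketch of that pattern follows; the parser below is illustrative only and is not the real `leann` CLI.

```python
import argparse


def make_parser() -> argparse.ArgumentParser:
    """Build a toy parser showing paired on/off flags with a shared dest."""
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument(
        "--recompute-embeddings",
        action="store_true",
        default=True,
        help="Recompute embeddings (default: True)",
    )
    # Both negative spellings write False into the same destination.
    parser.add_argument(
        "--no-recompute-embeddings",
        dest="recompute_embeddings",
        action="store_false",
        help="Disable embedding recomputation",
    )
    parser.add_argument(
        "--no-recompute",
        dest="recompute_embeddings",
        action="store_false",
        help="Alias for --no-recompute-embeddings",
    )
    return parser
```

Since the packages require Python 3.9 or newer, `argparse.BooleanOptionalAction` would be another way to generate the `--no-…` form automatically, although it would not provide the shorter `--no-recompute` alias.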
--- a/packages/leann-mcp/README.md
+++ b/packages/leann-mcp/README.md
@@ -4,20 +4,12 @@ Transform your development workflow with intelligent code assistance using LEANN

 ## Prerequisites

-**Step 1:** First, complete the basic LEANN installation following the [📦 Installation guide](../../README.md#installation) in the root README:
+Install LEANN globally for MCP integration (with default backend):

 ```bash
-uv venv
-source .venv/bin/activate
-uv pip install leann
+uv tool install leann-core --with leann
 ```
-
-**Step 2:** Install LEANN globally for MCP integration:
-```bash
-uv tool install leann-core
-```
-
-This makes the `leann` command available system-wide, which `leann_mcp` requires.
+This installs the `leann` CLI into an isolated tool environment together with the default backend, so `leann build` works out of the box.

 ## 🚀 Quick Setup

--- a/packages/leann/pyproject.toml
+++ b/packages/leann/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

 [project]
 name = "leann"
-version = "0.2.8"
+version = "0.2.9"
 description = "LEANN - The smallest vector index in the world. RAG Everything with LEANN!"
 readme = "README.md"
 requires-python = ">=3.9"
--- a/sky/leann-build.yaml
+++ b/sky/leann-build.yaml
@@ -0,0 +1,62 @@
+name: leann-build
+
+resources:
+  # Choose a GPU for fast embeddings (examples: L4, A10G, A100). CPU also works but is slower.
+  accelerators: L4:1
+  # Optionally pin a cloud, otherwise SkyPilot will auto-select
+  # cloud: aws
+  disk_size: 100
+
+envs:
+  # Build parameters (override with: sky launch -c leann-gpu sky/leann-build.yaml --env key=value)
+  index_name: my-index
+  docs: ./data
+  backend: hnsw               # hnsw | diskann
+  complexity: 64
+  graph_degree: 32
+  num_threads: 8
+  # Embedding selection
+  embedding_mode: sentence-transformers   # sentence-transformers | openai | mlx | ollama
+  embedding_model: facebook/contriever
+  # Storage/latency knobs
+  recompute: true             # true => selective recomputation; false => store full embeddings
+  compact: true               # for HNSW only: false when recompute=false
+  # Optional pass-through
+  extra_args: ""
+
+# Sync local paths to the remote VM. Adjust as needed.
+file_mounts:
+  # Example: mount your local data directory used for building
+  ~/leann-data: ${docs}
+
+setup: |
+  set -e
+  # Install uv (package manager)
+  curl -LsSf https://astral.sh/uv/install.sh | sh
+  export PATH="$HOME/.local/bin:$PATH"
+
+  # Install the LEANN CLI globally on the remote machine
+  uv tool install leann
+
+run: |
+  export PATH="$HOME/.local/bin:$PATH"
+  # Derive flags from env
+  recompute_flag=""
+  if [ "${recompute}" = "false" ] || [ "${recompute}" = "0" ]; then
+    recompute_flag="--no-recompute"
+  fi
+  compact_flag=""
+  if [ "${compact}" = "false" ] || [ "${compact}" = "0" ]; then
+    compact_flag="--no-compact"
+  fi
+
+  # Build command
+  leann build ${index_name} \
+    --docs ~/leann-data \
+    --backend ${backend} \
+    --complexity ${complexity} \
+    --graph-degree ${graph_degree} \
+    --num-threads ${num_threads} \
+    --embedding-mode ${embedding_mode} \
+    --embedding-model ${embedding_model} \
+    ${recompute_flag} ${compact_flag} ${extra_args}
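The `run` section turns the template's string parameters into CLI flags. The same derivation can be sketched in Python, assuming a hypothetical function `build_command` and a plain dict of parameters. The check that HNSW needs `compact=false` when `recompute=false` comes from the comments in this change; the template itself does not enforce it, so that guard is an addition of this sketch.

```python
import shlex


def enabled(value: str) -> bool:
    """Mirror the template: only 'false' and '0' turn a knob off."""
    return value not in ("false", "0")


def build_command(params: dict[str, str]) -> list[str]:
    """Assemble the `leann build` argument list from template-style values."""
    recompute = enabled(params.get("recompute", "true"))
    compact = enabled(params.get("compact", "true"))
    if params["backend"] == "hnsw" and not recompute and compact:
        # Guard added by this sketch, based on the template's comment.
        raise ValueError("HNSW needs compact=false when recompute=false")

    cmd = [
        "leann", "build", params["index_name"],
        "--docs", "~/leann-data",  # left unexpanded; expand before executing
        "--backend", params["backend"],
        "--complexity", params["complexity"],
        "--graph-degree", params["graph_degree"],
        "--num-threads", params["num_threads"],
        "--embedding-mode", params["embedding_mode"],
        "--embedding-model", params["embedding_model"],
    ]
    if not recompute:
        cmd.append("--no-recompute")
    if not compact:
        cmd.append("--no-compact")
    # Split pass-through arguments the way a POSIX shell would.
    cmd.extend(shlex.split(params.get("extra_args", "")))
    return cmd
```

Returning a list rather than a single string keeps arguments with spaces intact if the command is later passed to `subprocess.run`.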
Author	SHA1	Message	Date
Andy Lee	909d3cc6a8	fix(hnsw-server): robust ZMQ responses to prevent size mismatch and segfault in CI	2025-08-13 14:53:46 -07:00
Andy Lee	c994635af6	sky: expand leann-build.yaml with configurable params and flags (backend, recompute, compact, embedding options)	2025-08-13 14:18:48 -07:00
Andy Lee	23b80647c5	docs: dedupe recomputation guidance; keep single Low-resource setups section	2025-08-13 14:10:10 -07:00
Andy Lee	50121972ee	cli: add --no-recompute and --no-recompute-embeddings flags; docs: clarify HNSW requires --no-compact when disabling recompute	2025-08-13 14:09:05 -07:00
Andy Lee	07e5f10204	docs: consolidate low-resource guidance into config guide; README points to it	2025-08-13 14:08:23 -07:00
Andy Lee	58711bff7e	docs: add low-resource note in README; point to config guide; suggest OpenAI embeddings, SkyPilot remote build, and --no-recompute	2025-08-13 14:06:22 -07:00
Andy Lee	a69464eb16	docs: add SkyPilot template and instructions for running embeddings/index build on cloud GPU	2025-08-13 14:01:32 -07:00
Andy Lee	46565b9249	docs: follows #34, patch leann backends into tool environment	2025-08-12 17:56:02 -07:00
GitHub Actions	3dad76126a	chore: release v0.2.9	2025-08-12 23:00:12 +00:00
Andy Lee	18e28bda32	feat: Add macOS 15 support for M4 Mac compatibility (#38)	2025-08-12 14:01:02 -07:00

    * feat: add macOS 15 support for M4 Mac compatibility
      - Add macos-15 CI builds for Python 3.9-3.13
      - Update MACOSX_DEPLOYMENT_TARGET from 11.0/13.3 to 14.0 for broader compatibility
      - Addresses issue #34 with Mac M4 wheel compatibility
      🤖 Generated with [Claude Code](https://claude.ai/code)
      Co-Authored-By: Claude <noreply@anthropic.com>

    * fix: ensure wheels are compatible with older macOS versions
      - Set MACOSX_DEPLOYMENT_TARGET=11.0 for HNSW backend (broad compatibility)
      - Set MACOSX_DEPLOYMENT_TARGET=13.0 for DiskANN backend (required for LAPACK)
      - Add --require-target-macos-version to delocate-wheel commands
      - This fixes CI failures on macos-13 runners while maintaining M4 Mac support
      Fixes the issue where wheels built on macos-14 runners were incorrectly tagged as macosx_14_0, preventing installation on macos-13 runners.

    * fix: use macOS 13.3 for DiskANN backend as required by LAPACK
      DiskANN requires macOS 13.3+ for sgesdd_ LAPACK function, so we must use 13.3 as the deployment target, not 13.0.

    * fix: match deployment target with runner OS for library compatibility
      The issue is that Homebrew libraries on macOS 14 runners are built for macOS 14 and cannot be downgraded. We must use different deployment targets based on the runner OS:
      - macOS 13 runners: Can build for macOS 11.0 (HNSW) and 13.3 (DiskANN)
      - macOS 14 runners: Must build for macOS 14.0 (due to system libraries)
      This ensures delocate-wheel succeeds by matching the deployment target with the actual minimum version required by bundled libraries.

    * fix: add macOS 15 support to deployment target configuration
      The issue extends to macOS 15 runners where Homebrew libraries are built for macOS 15. We must handle all runner versions explicitly:
      - macOS 13 runners: Can build for macOS 11.0 (HNSW) and 13.3 (DiskANN)
      - macOS 14 runners: Must build for macOS 14.0 (system libraries)
      - macOS 15 runners: Must build for macOS 15.0 (system libraries)
      This ensures wheels are properly tagged for their actual minimum supported macOS version, matching the bundled libraries.

    * fix: correct macOS deployment targets based on Homebrew library requirements
      The key insight is that Homebrew libraries on each macOS version are compiled for that specific version:
      - macOS 13: Libraries require macOS 13.0 minimum
      - macOS 14: Libraries require macOS 14.0 minimum
      - macOS 15: Libraries require macOS 15.0 minimum
      We cannot build wheels for older macOS versions than what the bundled Homebrew libraries require. This means:
      - macOS 13 runners: Build for macOS 13.0+ (HNSW) and 13.3+ (DiskANN)
      - macOS 14 runners: Build for macOS 14.0+
      - macOS 15 runners: Build for macOS 15.0+
      This ensures delocate-wheel succeeds by matching deployment targets with the actual minimum versions required by system libraries.

    * fix: restore macOS 15 build matrix and correct test path
      - Add back macOS 15 configurations for Python 3.9-3.13
      - Fix pytest path from test/ to tests/ (correct directory name)
      The macOS 15 support was accidentally missing from the matrix, and pytest was looking for the wrong directory name.

    ---------
    Co-authored-by: Claude <noreply@anthropic.com>