Compare commits

...

10 Commits

Author SHA1 Message Date
Andy Lee
909d3cc6a8 fix(hnsw-server): robust ZMQ responses to prevent size mismatch and segfault in CI 2025-08-13 14:53:46 -07:00
Andy Lee
c994635af6 sky: expand leann-build.yaml with configurable params and flags (backend, recompute, compact, embedding options) 2025-08-13 14:18:48 -07:00
Andy Lee
23b80647c5 docs: dedupe recomputation guidance; keep single Low-resource setups section 2025-08-13 14:10:10 -07:00
Andy Lee
50121972ee cli: add --no-recompute and --no-recompute-embeddings flags; docs: clarify HNSW requires --no-compact when disabling recompute 2025-08-13 14:09:05 -07:00
Andy Lee
07e5f10204 docs: consolidate low-resource guidance into config guide; README points to it 2025-08-13 14:08:23 -07:00
Andy Lee
58711bff7e docs: add low-resource note in README; point to config guide; suggest OpenAI embeddings, SkyPilot remote build, and --no-recompute 2025-08-13 14:06:22 -07:00
Andy Lee
a69464eb16 docs: add SkyPilot template and instructions for running embeddings/index build on cloud GPU 2025-08-13 14:01:32 -07:00
Andy Lee
46565b9249 docs: follows #34, patch leann backends into tool environment 2025-08-12 17:56:02 -07:00
GitHub Actions
3dad76126a chore: release v0.2.9 2025-08-12 23:00:12 +00:00
Andy Lee
18e28bda32 feat: Add macOS 15 support for M4 Mac compatibility (#38)
* feat: add macOS 15 support for M4 Mac compatibility

- Add macos-15 CI builds for Python 3.9-3.13
- Update MACOSX_DEPLOYMENT_TARGET from 11.0/13.3 to 14.0 for broader compatibility
- Addresses issue #34 with Mac M4 wheel compatibility

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: ensure wheels are compatible with older macOS versions

- Set MACOSX_DEPLOYMENT_TARGET=11.0 for HNSW backend (broad compatibility)
- Set MACOSX_DEPLOYMENT_TARGET=13.0 for DiskANN backend (required for LAPACK)
- Add --require-target-macos-version to delocate-wheel commands
- This fixes CI failures on macos-13 runners while maintaining M4 Mac support

Fixes the issue where wheels built on macos-14 runners were incorrectly
tagged as macosx_14_0, preventing installation on macos-13 runners.

* fix: use macOS 13.3 for DiskANN backend as required by LAPACK

DiskANN requires macOS 13.3+ for sgesdd_ LAPACK function, so we must
use 13.3 as the deployment target, not 13.0.

* fix: match deployment target with runner OS for library compatibility

The issue is that Homebrew libraries on macOS 14 runners are built for
macOS 14 and cannot be downgraded. We must use different deployment
targets based on the runner OS:

- macOS 13 runners: Can build for macOS 11.0 (HNSW) and 13.3 (DiskANN)
- macOS 14 runners: Must build for macOS 14.0 (due to system libraries)

This ensures delocate-wheel succeeds by matching the deployment target
with the actual minimum version required by bundled libraries.

* fix: add macOS 15 support to deployment target configuration

The issue extends to macOS 15 runners where Homebrew libraries are built
for macOS 15. We must handle all runner versions explicitly:

- macOS 13 runners: Can build for macOS 11.0 (HNSW) and 13.3 (DiskANN)
- macOS 14 runners: Must build for macOS 14.0 (system libraries)
- macOS 15 runners: Must build for macOS 15.0 (system libraries)

This ensures wheels are properly tagged for their actual minimum
supported macOS version, matching the bundled libraries.

* fix: correct macOS deployment targets based on Homebrew library requirements

The key insight is that Homebrew libraries on each macOS version are
compiled for that specific version:
- macOS 13: Libraries require macOS 13.0 minimum
- macOS 14: Libraries require macOS 14.0 minimum
- macOS 15: Libraries require macOS 15.0 minimum

We cannot build wheels for older macOS versions than what the bundled
Homebrew libraries require. This means:
- macOS 13 runners: Build for macOS 13.0+ (HNSW) and 13.3+ (DiskANN)
- macOS 14 runners: Build for macOS 14.0+
- macOS 15 runners: Build for macOS 15.0+

This ensures delocate-wheel succeeds by matching deployment targets
with the actual minimum versions required by system libraries.

* fix: restore macOS 15 build matrix and correct test path

- Add back macOS 15 configurations for Python 3.9-3.13
- Fix pytest path from test/ to tests/ (correct directory name)

The macOS 15 support was accidentally missing from the matrix, and
pytest was looking for the wrong directory name.

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-08-12 14:01:02 -07:00
11 changed files with 336 additions and 83 deletions

View File

@@ -64,6 +64,16 @@ jobs:
python: '3.12'
- os: macos-14
python: '3.13'
- os: macos-15
python: '3.9'
- os: macos-15
python: '3.10'
- os: macos-15
python: '3.11'
- os: macos-15
python: '3.12'
- os: macos-15
python: '3.13'
- os: macos-13
python: '3.9'
- os: macos-13
@@ -147,7 +157,14 @@ jobs:
# Use system clang for better compatibility
export CC=clang
export CXX=clang++
export MACOSX_DEPLOYMENT_TARGET=11.0
# Homebrew libraries on each macOS version require matching minimum version
if [[ "${{ matrix.os }}" == "macos-13" ]]; then
export MACOSX_DEPLOYMENT_TARGET=13.0
elif [[ "${{ matrix.os }}" == "macos-14" ]]; then
export MACOSX_DEPLOYMENT_TARGET=14.0
elif [[ "${{ matrix.os }}" == "macos-15" ]]; then
export MACOSX_DEPLOYMENT_TARGET=15.0
fi
uv build --wheel --python ${{ matrix.python }} --find-links ${GITHUB_WORKSPACE}/packages/leann-core/dist
else
uv build --wheel --python ${{ matrix.python }} --find-links ${GITHUB_WORKSPACE}/packages/leann-core/dist
@@ -161,7 +178,14 @@ jobs:
export CC=clang
export CXX=clang++
# DiskANN requires macOS 13.3+ for sgesdd_ LAPACK function
export MACOSX_DEPLOYMENT_TARGET=13.3
# But Homebrew libraries on each macOS version require matching minimum version
if [[ "${{ matrix.os }}" == "macos-13" ]]; then
export MACOSX_DEPLOYMENT_TARGET=13.3
elif [[ "${{ matrix.os }}" == "macos-14" ]]; then
export MACOSX_DEPLOYMENT_TARGET=14.0
elif [[ "${{ matrix.os }}" == "macos-15" ]]; then
export MACOSX_DEPLOYMENT_TARGET=15.0
fi
uv build --wheel --python ${{ matrix.python }} --find-links ${GITHUB_WORKSPACE}/packages/leann-core/dist
else
uv build --wheel --python ${{ matrix.python }} --find-links ${GITHUB_WORKSPACE}/packages/leann-core/dist
@@ -197,10 +221,24 @@ jobs:
- name: Repair wheels (macOS)
if: runner.os == 'macOS'
run: |
# Determine deployment target based on runner OS
# Must match the Homebrew libraries for each macOS version
if [[ "${{ matrix.os }}" == "macos-13" ]]; then
HNSW_TARGET="13.0"
DISKANN_TARGET="13.3"
elif [[ "${{ matrix.os }}" == "macos-14" ]]; then
HNSW_TARGET="14.0"
DISKANN_TARGET="14.0"
elif [[ "${{ matrix.os }}" == "macos-15" ]]; then
HNSW_TARGET="15.0"
DISKANN_TARGET="15.0"
fi
# Repair HNSW wheel
cd packages/leann-backend-hnsw
if [ -d dist ]; then
delocate-wheel -w dist_repaired -v dist/*.whl
export MACOSX_DEPLOYMENT_TARGET=$HNSW_TARGET
delocate-wheel -w dist_repaired -v --require-target-macos-version $HNSW_TARGET dist/*.whl
rm -rf dist
mv dist_repaired dist
fi
@@ -209,7 +247,8 @@ jobs:
# Repair DiskANN wheel
cd packages/leann-backend-diskann
if [ -d dist ]; then
delocate-wheel -w dist_repaired -v dist/*.whl
export MACOSX_DEPLOYMENT_TARGET=$DISKANN_TARGET
delocate-wheel -w dist_repaired -v --require-target-macos-version $DISKANN_TARGET dist/*.whl
rm -rf dist
mv dist_repaired dist
fi
@@ -249,8 +288,8 @@ jobs:
# Activate virtual environment
source .venv/bin/activate || source .venv/Scripts/activate
# Run all tests
pytest tests/
# Run tests
pytest -v tests/
- name: Run sanity checks (optional)
run: |

View File

@@ -71,6 +71,8 @@ source .venv/bin/activate
uv pip install leann
```
> Low-resource? See “Low-resource setups” in the [Configuration Guide](docs/configuration-guide.md#low-resource-setups).
<details>
<summary>
<strong>🔧 Build from Source (Recommended for development)</strong>

View File

@@ -259,24 +259,80 @@ Every configuration choice involves trade-offs:
The key is finding the right balance for your specific use case. Start small and simple, measure performance, then scale up only where needed.
## Deep Dive: Critical Configuration Decisions
## Low-resource setups
### When to Disable Recomputation
If you dont have a local GPU or builds/searches are too slow, use one or more of the options below.
LEANN's recomputation feature provides exact distance calculations but can be disabled for extreme QPS requirements:
### 1) Use OpenAI embeddings (no local compute)
Fastest path with zero local GPU requirements. Set your API key and use OpenAI embeddings during build and search:
```bash
--no-recompute # Disable selective recomputation
export OPENAI_API_KEY=sk-...
# Build with OpenAI embeddings
leann build my-index \
--embedding-mode openai \
--embedding-model text-embedding-3-small
# Search with OpenAI embeddings (recompute at query time)
leann search my-index "your query" \
--recompute-embeddings
```
**Trade-offs**:
- **With recomputation** (default): Exact distances, best quality, higher latency, minimal storage (only stores metadata, recomputes embeddings on-demand)
- **Without recomputation**: Must store full embeddings, significantly higher memory and storage usage (10-100x more), but faster search
### 2) Run remote builds with SkyPilot (cloud GPU)
**Disable when**:
- You have abundant storage and memory
- Need extremely low latency (< 100ms)
- Running a read-heavy workload where storage cost is acceptable
Offload embedding generation and index building to a GPU VM using SkyPilot. A template is provided at `sky/leann-build.yaml`.
```bash
# One-time: install and configure SkyPilot
pip install skypilot
sky launch -c leann-gpu sky/leann-build.yaml
# Build remotely (template installs uv + leann CLI)
sky exec leann-gpu -- "leann build my-index --docs ~/leann-data --backend hnsw --complexity 64 --graph-degree 32"
```
Details: see “Running Builds on SkyPilot (Optional)” below.
### 3) Disable recomputation to trade storage for speed
If you need lower latency and have more storage/memory, disable recomputation. This stores full embeddings and avoids recomputing at search time.
```bash
# Build without recomputation (HNSW requires non-compact in this mode)
leann build my-index --no-recompute --no-compact
# Search without recomputation
leann search my-index "your query" --no-recompute
```
Trade-offs: lower query-time latency, but significantly higher storage usage.
## Running Builds on SkyPilot (Optional)
You can offload embedding generation and index building to a cloud GPU VM using SkyPilot, without changing any LEANN code. This is useful when your local machine lacks a GPU or you want faster throughput.
### Quick Start
1) Install SkyPilot by following their docs (`pip install skypilot`), then configure cloud credentials.
2) Use the provided SkyPilot template:
```bash
sky launch -c leann-gpu sky/leann-build.yaml
```
3) On the remote, either put your data under the mounted path or adjust `file_mounts` in `sky/leann-build.yaml`. Then run the LEANN build:
```bash
sky exec leann-gpu -- "leann build my-index --docs ~/leann-data --backend hnsw --complexity 64 --graph-degree 32"
```
Notes:
- The template installs `uv` and the `leann` CLI globally on the remote instance.
- Change the `accelerators` and `cloud` settings in `sky/leann-build.yaml` to match your budget/availability (e.g., `A10G:1`, `A100:1`, or CPU-only if you prefer).
- You can also build with `diskann` by switching `--backend diskann`.
## Further Reading

View File

@@ -4,8 +4,8 @@ build-backend = "scikit_build_core.build"
[project]
name = "leann-backend-diskann"
version = "0.2.8"
dependencies = ["leann-core==0.2.8", "numpy", "protobuf>=3.19.0"]
version = "0.2.9"
dependencies = ["leann-core==0.2.9", "numpy", "protobuf>=3.19.0"]
[tool.scikit-build]
# Key: simplified CMake path

View File

@@ -95,6 +95,8 @@ def create_hnsw_embedding_server(
passage_sources.append(source_copy)
passages = PassageManager(passage_sources)
# Use index dimensions from metadata for shaping fallback responses
embedding_dim: int = int(meta.get("dimensions", 0))
logger.info(
f"Loaded PassageManager with {len(passages.global_offset_map)} passages from metadata"
)
@@ -109,6 +111,9 @@ def create_hnsw_embedding_server(
socket.setsockopt(zmq.RCVTIMEO, 300000)
socket.setsockopt(zmq.SNDTIMEO, 300000)
# Track last request type for safe fallback responses on exceptions
last_request_type = "unknown" # one of: 'text', 'distance', 'embedding', 'unknown'
last_request_length = 0
while True:
try:
message_bytes = socket.recv()
@@ -121,6 +126,8 @@ def create_hnsw_embedding_server(
if isinstance(request_payload, list) and len(request_payload) > 0:
# Check if this is a direct text request (list of strings)
if all(isinstance(item, str) for item in request_payload):
last_request_type = "text"
last_request_length = len(request_payload)
logger.info(
f"Processing direct text embedding request for {len(request_payload)} texts in {embedding_mode} mode"
)
@@ -145,43 +152,66 @@ def create_hnsw_embedding_server(
):
node_ids = request_payload[0]
query_vector = np.array(request_payload[1], dtype=np.float32)
last_request_type = "distance"
last_request_length = len(node_ids)
logger.debug("Distance calculation request received")
logger.debug(f" Node IDs: {node_ids}")
logger.debug(f" Query vector dim: {len(query_vector)}")
# Get embeddings for node IDs
texts = []
for nid in node_ids:
# Get embeddings for node IDs, tolerate missing IDs
texts: list[str] = []
found_indices: list[int] = []
for idx, nid in enumerate(node_ids):
try:
passage_data = passages.get_passage(str(nid))
txt = passage_data["text"]
texts.append(txt)
txt = passage_data.get("text", "")
if isinstance(txt, str) and len(txt) > 0:
texts.append(txt)
found_indices.append(idx)
else:
logger.error(f"Empty text for passage ID {nid}")
except KeyError:
logger.error(f"Passage ID {nid} not found")
raise RuntimeError(f"FATAL: Passage with ID {nid} not found")
except Exception as e:
logger.error(f"Exception looking up passage ID {nid}: {e}")
raise
# Process embeddings
embeddings = compute_embeddings(texts, model_name, mode=embedding_mode)
logger.info(
f"Computed embeddings for {len(texts)} texts, shape: {embeddings.shape}"
# Prepare full-length response distances with safe fallbacks
large_distance = 1e9
response_distances = [large_distance] * len(node_ids)
if texts:
try:
# Process embeddings only for found indices
embeddings = compute_embeddings(texts, model_name, mode=embedding_mode)
logger.info(
f"Computed embeddings for {len(texts)} texts, shape: {embeddings.shape}"
)
# Calculate distances for found embeddings only
if distance_metric == "l2":
partial_distances = np.sum(
np.square(embeddings - query_vector.reshape(1, -1)), axis=1
)
else: # mips or cosine
partial_distances = -np.dot(embeddings, query_vector)
# Place computed distances back into the full response array
for pos, dval in zip(
found_indices, partial_distances.flatten().tolist()
):
response_distances[pos] = float(dval)
except Exception as e:
logger.error(
f"Distance computation error, falling back to large distances: {e}"
)
# Always reply with exactly len(node_ids) distances
response_bytes = msgpack.packb([response_distances], use_single_float=True)
logger.debug(
f"Sending distance response with {len(response_distances)} distances (found={len(found_indices)})"
)
# Calculate distances
if distance_metric == "l2":
distances = np.sum(
np.square(embeddings - query_vector.reshape(1, -1)), axis=1
)
else: # mips or cosine
distances = -np.dot(embeddings, query_vector)
response_payload = distances.flatten().tolist()
response_bytes = msgpack.packb([response_payload], use_single_float=True)
logger.debug(f"Sending distance response with {len(distances)} distances")
socket.send(response_bytes)
e2e_end = time.time()
logger.info(f"⏱️ Distance calculation E2E time: {e2e_end - e2e_start:.6f}s")
@@ -201,40 +231,61 @@ def create_hnsw_embedding_server(
node_ids = request_payload[0]
logger.debug(f"Request for {len(node_ids)} node embeddings")
last_request_type = "embedding"
last_request_length = len(node_ids)
# Look up texts by node IDs
texts = []
for nid in node_ids:
# Allocate output buffer (B, D) and fill with zeros for robustness
if embedding_dim <= 0:
logger.error("Embedding dimension unknown; cannot serve embedding request")
dims = [0, 0]
data = []
else:
dims = [len(node_ids), embedding_dim]
data = [0.0] * (dims[0] * dims[1])
# Look up texts by node IDs; compute embeddings where available
texts: list[str] = []
found_indices: list[int] = []
for idx, nid in enumerate(node_ids):
try:
passage_data = passages.get_passage(str(nid))
txt = passage_data["text"]
if not txt:
raise RuntimeError(f"FATAL: Empty text for passage ID {nid}")
texts.append(txt)
txt = passage_data.get("text", "")
if isinstance(txt, str) and len(txt) > 0:
texts.append(txt)
found_indices.append(idx)
else:
logger.error(f"Empty text for passage ID {nid}")
except KeyError:
raise RuntimeError(f"FATAL: Passage with ID {nid} not found")
logger.error(f"Passage with ID {nid} not found")
except Exception as e:
logger.error(f"Exception looking up passage ID {nid}: {e}")
raise
# Process embeddings
embeddings = compute_embeddings(texts, model_name, mode=embedding_mode)
logger.info(
f"Computed embeddings for {len(texts)} texts, shape: {embeddings.shape}"
)
if texts:
try:
# Process embeddings for found texts only
embeddings = compute_embeddings(texts, model_name, mode=embedding_mode)
logger.info(
f"Computed embeddings for {len(texts)} texts, shape: {embeddings.shape}"
)
# Serialization and response
if np.isnan(embeddings).any() or np.isinf(embeddings).any():
logger.error(
f"NaN or Inf detected in embeddings! Requested IDs: {node_ids[:5]}..."
)
raise AssertionError()
if np.isnan(embeddings).any() or np.isinf(embeddings).any():
logger.error(
f"NaN or Inf detected in embeddings! Requested IDs: {node_ids[:5]}..."
)
dims = [0, embedding_dim]
data = []
else:
# Copy computed embeddings into the correct positions
emb_f32 = np.ascontiguousarray(embeddings, dtype=np.float32)
flat = emb_f32.flatten().tolist()
for j, pos in enumerate(found_indices):
start = pos * embedding_dim
end = start + embedding_dim
data[start:end] = flat[j * embedding_dim : (j + 1) * embedding_dim]
except Exception as e:
logger.error(f"Embedding computation error, returning zeros: {e}")
hidden_contiguous_f32 = np.ascontiguousarray(embeddings, dtype=np.float32)
response_payload = [
list(hidden_contiguous_f32.shape),
hidden_contiguous_f32.flatten().tolist(),
]
response_payload = [dims, data]
response_bytes = msgpack.packb(response_payload, use_single_float=True)
socket.send(response_bytes)
@@ -249,7 +300,22 @@ def create_hnsw_embedding_server(
import traceback
traceback.print_exc()
socket.send(msgpack.packb([[], []]))
# Fallback to a safe, minimal-structure response to avoid client crashes
if last_request_type == "distance":
# Return a vector of large distances with the expected length
fallback_len = max(0, int(last_request_length))
large_distance = 1e9
safe_response = [[large_distance] * fallback_len]
elif last_request_type == "embedding":
# Return an empty embedding block with known dimension if available
if embedding_dim > 0:
safe_response = [[0, embedding_dim], []]
else:
safe_response = [[0, 0], []]
else:
# Unknown request type: default to empty embedding structure
safe_response = [[0, int(embedding_dim) if embedding_dim > 0 else 0], []]
socket.send(msgpack.packb(safe_response, use_single_float=True))
zmq_thread = threading.Thread(target=zmq_server_thread, daemon=True)
zmq_thread.start()

View File

@@ -6,10 +6,10 @@ build-backend = "scikit_build_core.build"
[project]
name = "leann-backend-hnsw"
version = "0.2.8"
version = "0.2.9"
description = "Custom-built HNSW (Faiss) backend for the Leann toolkit."
dependencies = [
"leann-core==0.2.8",
"leann-core==0.2.9",
"numpy",
"pyzmq>=23.0.0",
"msgpack>=1.0.0",

View File

@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project]
name = "leann-core"
version = "0.2.8"
version = "0.2.9"
description = "Core API and plugin system for LEANN"
readme = "README.md"
requires-python = ">=3.9"

View File

@@ -117,7 +117,19 @@ Examples:
build_parser.add_argument("--complexity", type=int, default=64)
build_parser.add_argument("--num-threads", type=int, default=1)
build_parser.add_argument("--compact", action="store_true", default=True)
build_parser.add_argument(
"--no-compact",
dest="compact",
action="store_false",
help="Disable compact index storage (store full embeddings; higher storage)",
)
build_parser.add_argument("--recompute", action="store_true", default=True)
build_parser.add_argument(
"--no-recompute",
dest="recompute",
action="store_false",
help="Disable embedding recomputation (store full embeddings; lower query latency)",
)
build_parser.add_argument(
"--file-types",
type=str,
@@ -138,6 +150,18 @@ Examples:
default=True,
help="Recompute embeddings (default: True)",
)
search_parser.add_argument(
"--no-recompute-embeddings",
dest="recompute_embeddings",
action="store_false",
help="Disable embedding recomputation during search",
)
search_parser.add_argument(
"--no-recompute",
dest="recompute_embeddings",
action="store_false",
help="Alias for --no-recompute-embeddings",
)
search_parser.add_argument(
"--pruning-strategy",
choices=["global", "local", "proportional"],
@@ -166,6 +190,18 @@ Examples:
default=True,
help="Recompute embeddings (default: True)",
)
ask_parser.add_argument(
"--no-recompute-embeddings",
dest="recompute_embeddings",
action="store_false",
help="Disable embedding recomputation during ask",
)
ask_parser.add_argument(
"--no-recompute",
dest="recompute_embeddings",
action="store_false",
help="Alias for --no-recompute-embeddings",
)
ask_parser.add_argument(
"--pruning-strategy",
choices=["global", "local", "proportional"],

View File

@@ -4,20 +4,12 @@ Transform your development workflow with intelligent code assistance using LEANN
## Prerequisites
**Step 1:** First, complete the basic LEANN installation following the [📦 Installation guide](../../README.md#installation) in the root README:
Install LEANN globally for MCP integration (with default backend):
```bash
uv venv
source .venv/bin/activate
uv pip install leann
uv tool install leann-core --with leann
```
**Step 2:** Install LEANN globally for MCP integration:
```bash
uv tool install leann-core
```
This makes the `leann` command available system-wide, which `leann_mcp` requires.
This installs the `leann` CLI into an isolated tool environment and includes both backends so `leann build` works out-of-the-box.
## 🚀 Quick Setup

View File

@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project]
name = "leann"
version = "0.2.8"
version = "0.2.9"
description = "LEANN - The smallest vector index in the world. RAG Everything with LEANN!"
readme = "README.md"
requires-python = ">=3.9"

62
sky/leann-build.yaml Normal file
View File

@@ -0,0 +1,62 @@
name: leann-build
resources:
# Choose a GPU for fast embeddings (examples: L4, A10G, A100). CPU also works but is slower.
accelerators: L4:1
# Optionally pin a cloud, otherwise SkyPilot will auto-select
# cloud: aws
disk_size: 100
env:
# Build parameters (override with: sky launch -c leann-gpu sky/leann-build.yaml -e key=value)
index_name: my-index
docs: ./data
backend: hnsw # hnsw | diskann
complexity: 64
graph_degree: 32
num_threads: 8
# Embedding selection
embedding_mode: sentence-transformers # sentence-transformers | openai | mlx | ollama
embedding_model: facebook/contriever
# Storage/latency knobs
recompute: true # true => selective recomputation; false => store full embeddings
compact: true # for HNSW only: false when recompute=false
# Optional pass-through
extra_args: ""
# Sync local paths to the remote VM. Adjust as needed.
file_mounts:
# Example: mount your local data directory used for building
~/leann-data: ${docs}
setup: |
set -e
# Install uv (package manager)
curl -LsSf https://astral.sh/uv/install.sh | sh
export PATH="$HOME/.local/bin:$PATH"
# Install the LEANN CLI globally on the remote machine
uv tool install leann
run: |
export PATH="$HOME/.local/bin:$PATH"
# Derive flags from env
recompute_flag=""
if [ "${recompute}" = "false" ] || [ "${recompute}" = "0" ]; then
recompute_flag="--no-recompute"
fi
compact_flag=""
if [ "${compact}" = "false" ] || [ "${compact}" = "0" ]; then
compact_flag="--no-compact"
fi
# Build command
leann build ${index_name} \
--docs ~/leann-data \
--backend ${backend} \
--complexity ${complexity} \
--graph-degree ${graph_degree} \
--num-threads ${num_threads} \
--embedding-mode ${embedding_mode} \
--embedding-model ${embedding_model} \
${recompute_flag} ${compact_flag} ${extra_args}