Compare commits

..

4 Commits

Author SHA1 Message Date
GitHub Actions
738f1dbab8 chore: release v0.3.1 2025-08-19 05:56:45 +00:00
yichuan520030910320
37d990d51c [feature] fix cli 2025-08-18 22:55:43 -07:00
Andy Lee
a6f07a54f1 fix: Use uv venv for Arch Linux CI wheel installation (#69)
- Use astral-sh/setup-uv@v4 action for consistency with other jobs
- Create virtual environment with uv venv to bypass PEP 668 restrictions
- Install wheels using uv pip install for faster dependency resolution
- Maintain tool consistency across the entire CI pipeline

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Claude <noreply@anthropic.com>
2025-08-16 21:32:19 -07:00
Andy Lee
46905e0687 feat: Improve DiskANN cross-platform compatibility and add Arch Linux support (#66)
* feat: Enhance CLI with improved list and smart remove commands

##  New Features

### 🏠 Enhanced `leann list` command
- **Better UX**: Current project shown first with clear separation
- **Visual improvements**: Icons (🏠/📂), better formatting, size info
- **Smart guidance**: Context-aware usage examples and getting started tips

### 🛡️ Smart `leann remove` command
- **Safety first**: Always shows ALL matching indexes across projects
- **Intelligent handling**:
  - Single match: Clear location display with cross-project warnings
  - Multiple matches: Interactive selection with final confirmation
- **Prevents accidents**: No more deleting wrong indexes due to name conflicts
- **User-friendly**: 'c' to cancel, clear visual hierarchy, detailed info

### 🔧 Technical improvements
- **Clean logging**: Hide debug messages for better CLI experience
- **Comprehensive search**: Always scan all projects for transparency
- **Error handling**: Graceful handling of edge cases and user input

## 🎯 Impact
- **Safer**: Eliminates risk of accidental index deletion
- **Clearer**: Users always know what they're operating on
- **Smarter**: Automatic detection and handling of common scenarios

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* chore: vscode ruff, and format

* fix: Update DiskANN submodule with MKL linking improvements

Updates DiskANN submodule to include fix for MKL linking issues:
- Replaces global link_libraries() with target-specific linking
- Uses dynamic MKL linking (mkl_rt) for better cross-platform compatibility
- Prevents MKL contamination of unrelated targets (like zlib tests)
- Resolves build failures on strict linkers (Arch Linux) while maintaining Ubuntu compatibility

DiskANN commit: c593831 - fix: Replace global MKL linking with target-specific approach

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* chore: all linux deps

* fix: Update Intel MKL download link to avoid 403 error

- Replace problematic Intel download URL that returns 403 Forbidden
- Use general Intel oneAPI MKL page instead of specific download parameters
- This fixes the lychee link checker CI failure

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: Configure lychee to use browser User-Agent for Intel links

- Replace domain exclusion with browser User-Agent to properly check Intel links
- Intel website blocks automated tools but allows browser-like requests
- This enables proper link validation while avoiding 403 Forbidden errors

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: Use curl User-Agent for lychee link checking

Intel website has specific anti-bot logic:
- Blocks browser User-Agents (returns 403)
- Blocks lychee default User-Agent (returns 403)
- Allows curl User-Agent (returns 200)

This enables proper link validation for Intel documentation.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-08-16 14:42:20 -07:00
7 changed files with 3756 additions and 3809 deletions

View File

@@ -183,6 +183,9 @@ class Benchmark:
start_time = time.time()
with torch.no_grad():
self.model(input_ids=input_ids, attention_mask=attention_mask)
# mps sync
if torch.backends.mps.is_available():
torch.mps.synchronize()
end_time = time.time()
return end_time - start_time

View File

@@ -4,8 +4,8 @@ build-backend = "scikit_build_core.build"
[project]
name = "leann-backend-diskann"
version = "0.3.0"
dependencies = ["leann-core==0.3.0", "numpy", "protobuf>=3.19.0"]
version = "0.3.1"
dependencies = ["leann-core==0.3.1", "numpy", "protobuf>=3.19.0"]
[tool.scikit-build]
# Key: simplified CMake path

View File

@@ -6,10 +6,10 @@ build-backend = "scikit_build_core.build"
[project]
name = "leann-backend-hnsw"
version = "0.3.0"
version = "0.3.1"
description = "Custom-built HNSW (Faiss) backend for the Leann toolkit."
dependencies = [
"leann-core==0.3.0",
"leann-core==0.3.1",
"numpy",
"pyzmq>=23.0.0",
"msgpack>=1.0.0",

View File

@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project]
name = "leann-core"
version = "0.3.0"
version = "0.3.1"
description = "Core API and plugin system for LEANN"
readme = "README.md"
requires-python = ">=3.9"

View File

@@ -405,13 +405,9 @@ Examples:
print("💡 Get started:")
print(" leann build my-docs --docs ./documents")
else:
projects_count = len(
[
p
for p in valid_projects
if (p / ".leann" / "indexes").exists()
and list((p / ".leann" / "indexes").iterdir())
]
# Count only projects that have at least one discoverable index
projects_count = sum(
1 for p in valid_projects if len(self._discover_indexes_in_project(p)) > 0
)
print(f"📊 Total: {total_indexes} indexes across {projects_count} projects")
@@ -461,26 +457,35 @@ Examples:
)
# 2. Apps format: *.leann.meta.json files anywhere in the project
cli_indexes_dir = project_path / ".leann" / "indexes"
for meta_file in project_path.rglob("*.leann.meta.json"):
if meta_file.is_file():
# Extract index name from filename (remove .leann.meta.json extension)
index_name = meta_file.name.replace(".leann.meta.json", "")
# Skip CLI-built indexes (which store meta under .leann/indexes/<name>/)
try:
if cli_indexes_dir.exists() and cli_indexes_dir in meta_file.parents:
continue
except Exception:
pass
# Use the parent directory name as the app index display name
display_name = meta_file.parent.name
# Extract file base used to store files
file_base = meta_file.name.replace(".leann.meta.json", "")
# Apps indexes are considered complete if the .leann.meta.json file exists
status = ""
# Calculate total size of all related files
# Calculate total size of all related files (use file base)
size_mb = 0
try:
index_dir = meta_file.parent
for related_file in index_dir.glob(f"{index_name}.leann*"):
for related_file in index_dir.glob(f"{file_base}.leann*"):
size_mb += related_file.stat().st_size / (1024 * 1024)
except (OSError, PermissionError):
pass
indexes.append(
{
"name": index_name,
"name": display_name,
"type": "app",
"status": status,
"size_mb": size_mb,
@@ -534,13 +539,79 @@ Examples:
if not project_path.exists():
continue
# 1) CLI-format index under .leann/indexes/<name>
index_dir = project_path / ".leann" / "indexes" / index_name
if index_dir.exists():
is_current = project_path == current_path
matches.append(
{"project_path": project_path, "index_dir": index_dir, "is_current": is_current}
{
"project_path": project_path,
"index_dir": index_dir,
"is_current": is_current,
"kind": "cli",
}
)
# 2) App-format indexes
# We support two ways of addressing apps:
# a) by the file base (e.g., `pdf_documents`)
# b) by the parent directory name (e.g., `new_txt`)
seen_app_meta = set()
# 2a) by file base
for meta_file in project_path.rglob(f"{index_name}.leann.meta.json"):
if meta_file.is_file():
# Skip CLI-built indexes' meta under .leann/indexes
try:
cli_indexes_dir = project_path / ".leann" / "indexes"
if cli_indexes_dir.exists() and cli_indexes_dir in meta_file.parents:
continue
except Exception:
pass
is_current = project_path == current_path
key = (str(project_path), str(meta_file))
if key in seen_app_meta:
continue
seen_app_meta.add(key)
matches.append(
{
"project_path": project_path,
"files_dir": meta_file.parent,
"meta_file": meta_file,
"is_current": is_current,
"kind": "app",
"display_name": meta_file.parent.name,
"file_base": meta_file.name.replace(".leann.meta.json", ""),
}
)
# 2b) by parent directory name
for meta_file in project_path.rglob("*.leann.meta.json"):
if meta_file.is_file() and meta_file.parent.name == index_name:
# Skip CLI-built indexes' meta under .leann/indexes
try:
cli_indexes_dir = project_path / ".leann" / "indexes"
if cli_indexes_dir.exists() and cli_indexes_dir in meta_file.parents:
continue
except Exception:
pass
is_current = project_path == current_path
key = (str(project_path), str(meta_file))
if key in seen_app_meta:
continue
seen_app_meta.add(key)
matches.append(
{
"project_path": project_path,
"files_dir": meta_file.parent,
"meta_file": meta_file,
"is_current": is_current,
"kind": "app",
"display_name": meta_file.parent.name,
"file_base": meta_file.name.replace(".leann.meta.json", ""),
}
)
# Sort: current project first, then by project name
matches.sort(key=lambda x: (not x["is_current"], x["project_path"].name))
return matches
@@ -548,8 +619,8 @@ Examples:
def _remove_single_match(self, match, index_name: str, force: bool):
"""Handle removal when only one match is found"""
project_path = match["project_path"]
index_dir = match["index_dir"]
is_current = match["is_current"]
kind = match.get("kind", "cli")
if is_current:
location_info = "current project"
@@ -560,7 +631,10 @@ Examples:
print(f"✅ Found 1 index named '{index_name}':")
print(f" {emoji} Location: {location_info}")
print(f" 📍 Path: {project_path}")
if kind == "cli":
print(f" 📍 Path: {project_path / '.leann' / 'indexes' / index_name}")
else:
print(f" 📍 Meta: {match['meta_file']}")
if not force:
if not is_current:
@@ -572,9 +646,22 @@ Examples:
print(" ❌ Removal cancelled.")
return False
return self._delete_index_directory(
index_dir, index_name, project_path if not is_current else None
)
if kind == "cli":
return self._delete_index_directory(
match["index_dir"],
index_name,
project_path if not is_current else None,
is_app=False,
)
else:
return self._delete_index_directory(
match["files_dir"],
match.get("display_name", index_name),
project_path if not is_current else None,
is_app=True,
meta_file=match.get("meta_file"),
app_file_base=match.get("file_base"),
)
def _remove_from_multiple_matches(self, matches, index_name: str, force: bool):
"""Handle removal when multiple matches are found"""
@@ -585,19 +672,34 @@ Examples:
for i, match in enumerate(matches, 1):
project_path = match["project_path"]
is_current = match["is_current"]
kind = match.get("kind", "cli")
if is_current:
print(f" {i}. 🏠 Current project")
print(f" 📍 {project_path}")
print(f" {i}. 🏠 Current project ({'CLI' if kind == 'cli' else 'APP'})")
else:
print(f" {i}. 📂 {project_path.name}")
print(f" 📍 {project_path}")
print(f" {i}. 📂 {project_path.name} ({'CLI' if kind == 'cli' else 'APP'})")
# Show path details
if kind == "cli":
print(f" 📍 {project_path / '.leann' / 'indexes' / index_name}")
else:
print(f" 📍 {match['meta_file']}")
# Show size info
try:
size_mb = sum(
f.stat().st_size for f in match["index_dir"].iterdir() if f.is_file()
) / (1024 * 1024)
if kind == "cli":
size_mb = sum(
f.stat().st_size for f in match["index_dir"].iterdir() if f.is_file()
) / (1024 * 1024)
else:
file_base = match.get("file_base")
size_mb = 0.0
if file_base:
size_mb = sum(
f.stat().st_size
for f in match["files_dir"].glob(f"{file_base}.leann*")
if f.is_file()
) / (1024 * 1024)
print(f" 📦 Size: {size_mb:.1f} MB")
except (OSError, PermissionError):
pass
@@ -621,8 +723,8 @@ Examples:
if 0 <= choice_idx < len(matches):
selected_match = matches[choice_idx]
project_path = selected_match["project_path"]
index_dir = selected_match["index_dir"]
is_current = selected_match["is_current"]
kind = selected_match.get("kind", "cli")
location = "current project" if is_current else f"'{project_path.name}' project"
print(f" 🎯 Selected: Remove from {location}")
@@ -635,9 +737,22 @@ Examples:
print(" ❌ Confirmation failed. Removal cancelled.")
return False
return self._delete_index_directory(
index_dir, index_name, project_path if not is_current else None
)
if kind == "cli":
return self._delete_index_directory(
selected_match["index_dir"],
index_name,
project_path if not is_current else None,
is_app=False,
)
else:
return self._delete_index_directory(
selected_match["files_dir"],
selected_match.get("display_name", index_name),
project_path if not is_current else None,
is_app=True,
meta_file=selected_match.get("meta_file"),
app_file_base=selected_match.get("file_base"),
)
else:
print(" ❌ Invalid choice. Removal cancelled.")
return False
@@ -647,21 +762,65 @@ Examples:
return False
def _delete_index_directory(
self, index_dir: Path, index_name: str, project_path: Optional[Path] = None
self,
index_dir: Path,
index_display_name: str,
project_path: Optional[Path] = None,
is_app: bool = False,
meta_file: Optional[Path] = None,
app_file_base: Optional[str] = None,
):
"""Actually delete the index directory"""
"""Delete a CLI index directory or APP index files safely."""
try:
import shutil
if is_app:
removed = 0
errors = 0
# Delete only files that belong to this app index (based on file base)
pattern_base = app_file_base or ""
for f in index_dir.glob(f"{pattern_base}.leann*"):
try:
f.unlink()
removed += 1
except Exception:
errors += 1
# Best-effort: also remove the meta file if specified and still exists
if meta_file and meta_file.exists():
try:
meta_file.unlink()
removed += 1
except Exception:
errors += 1
shutil.rmtree(index_dir)
if project_path:
print(f"Index '{index_name}' removed from {project_path.name}")
if removed > 0 and errors == 0:
if project_path:
print(
f"App index '{index_display_name}' removed from {project_path.name}"
)
else:
print(f"✅ App index '{index_display_name}' removed successfully")
return True
elif removed > 0 and errors > 0:
print(
f"⚠️ App index '{index_display_name}' partially removed (some files couldn't be deleted)"
)
return True
else:
print(
f"❌ No files found to remove for app index '{index_display_name}' in {index_dir}"
)
return False
else:
print(f"✅ Index '{index_name}' removed successfully")
return True
import shutil
shutil.rmtree(index_dir)
if project_path:
print(f"✅ Index '{index_display_name}' removed from {project_path.name}")
else:
print(f"✅ Index '{index_display_name}' removed successfully")
return True
except Exception as e:
print(f"❌ Error removing index '{index_name}': {e}")
print(f"❌ Error removing index '{index_display_name}': {e}")
return False
def load_documents(

View File

@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project]
name = "leann"
version = "0.3.0"
version = "0.3.1"
description = "LEANN - The smallest vector index in the world. RAG Everything with LEANN!"
readme = "README.md"
requires-python = ">=3.9"

7313
uv.lock generated
View File

File diff suppressed because it is too large Load Diff