LEANN

Author	SHA1	Message	Date
Alex	17cbd07b25	Add Anthropic LLM support (#185 ) * Add Anthropic LLM support Signed-off-by: droctothorpe <mythicalsunlight@gmail.com> * Update skypilot link Signed-off-by: droctothorpe <mythicalsunlight@gmail.com> * Handle anthropic base_url Signed-off-by: droctothorpe <mythicalsunlight@gmail.com> * Address ruff format finding Signed-off-by: droctothorpe <mythicalsunlight@gmail.com> --------- Signed-off-by: droctothorpe <mythicalsunlight@gmail.com>	2025-12-12 10:53:41 -08:00
Aakash Suresh	e268392d5b	Fix: Prevent duplicate PDF processing when using --file-types .pdf (#179 ) Fixes #175 Problem: When --file-types .pdf is specified, PDFs were being processed twice: 1. Separately with PyMuPDF/pdfplumber extractors 2. Again in the 'other file types' section via SimpleDirectoryReader This caused duplicate processing and potential conflicts. Solution: - Exclude .pdf from other_file_extensions when PDFs are already processed separately - Only load other file types if there are extensions to process - Prevents duplicate PDF processing Changes: - Added logic to filter out .pdf from code_extensions when loading other file types if PDFs were processed separately - Updated SimpleDirectoryReader to use filtered extensions - Added check to skip loading if no other extensions to process	2025-12-01 13:48:44 -08:00
ww26	969f514564	Fix prompt template bugs: build template ignored and runtime override not wired (#173 ) * Fix prompt template bugs in build and search Bug 1: Build template ignored in new format - Updated compute_embeddings_openai() to read build_prompt_template or prompt_template - Updated compute_embeddings_ollama() with same fix - Maintains backward compatibility with old single-template format Bug 2: Runtime override not wired up - Wired CLI search to pass provider_options to searcher.search() - Enables runtime template override during search via --embedding-prompt-template All 42 prompt template tests passing. Fixes #155 * Fix: Prevent embedding server from applying templates during search - Filter out all prompt templates (build_prompt_template, query_prompt_template, prompt_template) from provider_options when launching embedding server during search - Templates are already applied in compute_query_embedding() before server call - Prevents double-templating and ensures runtime override works correctly This fixes the issue where --embedding-prompt-template during search was ignored because the server was applying build_prompt_template instead. * Format code with ruff	2025-11-16 20:56:42 -08:00
ww26	1ef9cba7de	Feature/prompt templates and lmstudio sdk (#171 ) * Add prompt template support and LM Studio SDK integration Features: - Prompt template support for embedding models (via --embedding-prompt-template) - LM Studio SDK integration for automatic context length detection - Hybrid token limit discovery (Ollama → LM Studio → Registry → Default) - Client-side token truncation to prevent silent failures - Automatic persistence of embedding_options to .meta.json Implementation: - Added _query_lmstudio_context_limit() with Node.js subprocess bridge - Modified compute_embeddings_openai() to apply prompt templates before truncation - Extended CLI with --embedding-prompt-template flag for build and search - URL detection for LM Studio (port 1234 or lmstudio/lm.studio keywords) - HTTP→WebSocket URL conversion for SDK compatibility Tests: - 60 passing tests across 5 test files - Comprehensive coverage of prompt templates, LM Studio integration, and token handling - Parametrized tests for maintainability and clarity * Add integration tests and fix LM Studio SDK bridge Features: - End-to-end integration tests for prompt template with EmbeddingGemma - Integration tests for hybrid token limit discovery mechanism - Tests verify real-world functionality with live services (LM Studio, Ollama) Fixes: - LM Studio SDK bridge now uses client.embedding.load() for embedding models - Fixed NODE_PATH resolution to include npm global modules - Fixed integration test to use WebSocket URL (ws://) for SDK bridge Tests: - test_prompt_template_e2e.py: 8 integration tests covering: - Prompt template prepending with LM Studio (EmbeddingGemma) - LM Studio SDK bridge for context length detection - Ollama dynamic token limit detection - Hybrid discovery fallback mechanism (registry, default) - All tests marked with @pytest.mark.integration for selective execution - Tests gracefully skip when services unavailable Documentation: - Updated tests/README.md with integration test section - Added 
prerequisites and running instructions - Documented that prompt templates are ONLY for EmbeddingGemma - Added integration marker to pyproject.toml Test Results: - All 8 integration tests passing with live services - Confirmed prompt templates work correctly with EmbeddingGemma - Verified LM Studio SDK bridge auto-detects context length (2048) - Validated hybrid token limit discovery across all backends * Add prompt template support to Ollama mode Extends prompt template functionality from OpenAI mode to Ollama for backend consistency. Changes: - Add provider_options parameter to compute_embeddings_ollama() - Apply prompt template before token truncation (lines 1005-1011) - Pass provider_options through compute_embeddings() call chain Tests: - test_ollama_embedding_with_prompt_template: Verifies templates work with Ollama - test_ollama_prompt_template_affects_embeddings: Confirms embeddings differ with/without template - Both tests pass with live Ollama service (2/2 passing) Usage: leann build --embedding-mode ollama --embedding-prompt-template "query: " ... * Fix LM Studio SDK bridge to respect JIT auto-evict settings Problem: SDK bridge called client.embedding.load() which loaded models into LM Studio memory and bypassed JIT auto-evict settings, causing duplicate model instances to accumulate. Root cause analysis (from Perplexity research): - Explicit SDK load() commands are treated as "pinned" models - JIT auto-evict only applies to models loaded reactively via API requests - SDK-loaded models remain in memory until explicitly unloaded Solutions implemented: 1. Add model.unload() after metadata query (line 243) - Load model temporarily to get context length - Unload immediately to hand control back to JIT system - Subsequent API requests trigger JIT load with auto-evict 2. 
Add token limit caching to prevent repeated SDK calls - Cache discovered limits in _token_limit_cache dict (line 48) - Key: (model_name, base_url), Value: token_limit - Prevents duplicate load/unload cycles within same process - Cache shared across all discovery methods (Ollama, SDK, registry) Tests: - TestTokenLimitCaching: 5 tests for cache behavior (integrated into test_token_truncation.py) - Manual testing confirmed no duplicate models in LM Studio after fix - All existing tests pass Impact: - Respects user's LM Studio JIT and auto-evict settings - Reduces model memory footprint - Faster subsequent builds (cached limits) * Document prompt template and LM Studio SDK features Added comprehensive documentation for new optional embedding features: Configuration Guide (docs/configuration-guide.md): - New section: "Optional Embedding Features" - Task-Specific Prompt Templates subsection: - Explains EmbeddingGemma use case with document/query prompts - CLI and Python API examples - Clear warnings about compatible vs incompatible models - References to GitHub issue #155 and HuggingFace blog - LM Studio Auto-Detection subsection: - Prerequisites (Node.js + @lmstudio/sdk) - How auto-detection works (4-step process) - Benefits and optional nature clearly stated FAQ (docs/faq.md): - FAQ #2: When should I use prompt templates? - DO/DON'T guidance with examples - Links to detailed configuration guide - FAQ #3: Why is LM Studio loading multiple copies? - Explains the JIT auto-evict fix - Troubleshooting steps if still seeing issues - FAQ #4: Do I need Node.js and @lmstudio/sdk? - Clarifies it's completely optional - Lists benefits if installed - Installation instructions Cross-references between documents for easy navigation between quick reference and detailed guides. * Add separate build/query template support for task-specific models Task-specific models like EmbeddingGemma require different templates for indexing vs searching. 
Store both templates at build time and auto-apply query template during search with backward compatibility. * Consolidate prompt template tests from 44 to 37 tests Merged redundant no-op tests, removed low-value implementation tests, consolidated parameterized CLI tests, and removed hanging over-mocked test. All tests pass with improved focus on behavioral testing. * Fix query template application in compute_query_embedding Query templates were only applied in the fallback code path, not when using the embedding server (default path). This meant stored query templates in index metadata were ignored during MCP and CLI searches. Changes: - Move template application to before any computation path (searcher_base.py:109-110) - Add comprehensive tests for both server and fallback paths - Consolidate tests into test_prompt_template_persistence.py Tests verify: - Template applied when using embedding server - Template applied in fallback path - Consistent behavior between both paths * Apply ruff formatting and fix linting issues - Remove unused imports - Fix import ordering - Remove unused variables - Apply code formatting * Fix CI test failures: mock OPENAI_API_KEY in tests Tests were failing in CI because compute_embeddings_openai() checks for OPENAI_API_KEY before using the mocked client. Added monkeypatch to set fake API key in test fixture.	2025-11-14 15:25:17 -08:00
ww26	c3aceed1e0	metadata reveal for ast-chunking; smart detection of seq length in ollama; auto adjust chunk length for ast to prevent silent truncation (#157 ) * feat: enhance token limits with dynamic discovery + AST metadata Improves upon upstream PR #154 with two major enhancements: 1. Hybrid Token Limit Discovery - Dynamic: Query Ollama /api/show for context limits - Fallback: Registry for LM Studio/OpenAI - Zero maintenance for Ollama users - Respects custom num_ctx settings 2. AST Metadata Preservation - create_ast_chunks() returns dict format with metadata - Preserves file_path, file_name, timestamps - Includes astchunk metadata (line numbers, node counts) - Fixes content extraction bug (checks "content" key) - Enables --show-metadata flag 3. Better Token Limits - nomic-embed-text: 2048 tokens (vs 512) - nomic-embed-text-v1.5: 2048 tokens - Added OpenAI models: 8192 tokens 4. Comprehensive Tests - 11 tests for token truncation - 545 new lines in test_astchunk_integration.py - All metadata preservation tests passing * fix: merge EMBEDDING_MODEL_LIMITS and remove redundant validation - Merged upstream's model list with our corrected token limits - Kept our corrected nomic-embed-text: 2048 (not 512) - Removed post-chunking validation (redundant with embedding-time truncation) - All tests passing except 2 pre-existing integration test failures * style: apply ruff formatting and restore PR #154 version handling - Remove duplicate truncate_to_token_limit and get_model_token_limit functions - Restore version handling logic (model:latest -> model) from PR #154 - Restore partial matching fallback for model name variations - Apply ruff formatting to all modified files - All 11 token truncation tests passing * style: sort imports alphabetically (pre-commit auto-fix) * fix: show AST token limit warning only once per session - Add module-level flag to track if warning shown - Prevents spam when processing multiple files - Add clarifying note that auto-truncation happens 
at embedding time - Addresses issue where warning appeared for every code file * enhance: add detailed logging for token truncation - Track and report truncation statistics (count, tokens removed, max length) - Show first 3 individual truncations with exact token counts - Provide comprehensive summary when truncation occurs - Use WARNING level for data loss visibility - Silent (DEBUG level only) when no truncation needed Replaces misleading "truncated where necessary" message that appeared even when nothing was truncated.	2025-11-08 17:37:31 -08:00
aakash	64b92a04a7	fixing chunking token issues within limit for embedding models	2025-10-31 17:15:00 -07:00
ww26	a85d0ad4a7	Feature/optimize ollama batching (#152 ) * feat: add metadata output to search results - Add --show-metadata flag to display file paths in search results - Preserve document metadata (file_path, file_name, timestamps) during chunking - Update MCP tool schema to support show_metadata parameter - Enhance CLI search output to display metadata when requested - Fix pre-existing bug: args.backend -> args.backend_name Resolves yichuan-w/LEANN#144 * fix: resolve ZMQ linking issues in Python extension - Use pkg_check_modules IMPORTED_TARGET to create PkgConfig::ZMQ - Set PKG_CONFIG_PATH to prioritize ARM64 Homebrew on Apple Silicon - Override macOS -undefined dynamic_lookup to force proper symbol resolution - Use PUBLIC linkage for ZMQ in faiss library for transitive linking - Mark cppzmq includes as SYSTEM to suppress warnings Fixes editable install ZMQ symbol errors while maintaining compatibility across Linux, macOS Intel, and macOS ARM64 platforms. * style: apply ruff formatting * chore: update faiss submodule to use ww2283 fork Use ww2283/faiss fork with fix/zmq-linking branch to resolve CI checkout failures. The ZMQ linking fixes are not yet merged upstream. * feat: implement true batch processing for Ollama embeddings Migrate from deprecated /api/embeddings to modern /api/embed endpoint which supports batch inputs. This reduces HTTP overhead by sending 32 texts per request instead of making individual API calls. 
Changes: - Update endpoint from /api/embeddings to /api/embed - Change parameter from 'prompt' (single) to 'input' (array) - Update response parsing for batch embeddings array - Increase timeout to 60s for batch processing - Improve error handling for batch requests Performance: - Reduces API calls by 32x (batch size) - Eliminates HTTP connection overhead per text - Note: Ollama still processes batch items sequentially internally Related: #151 * fall back to original faiss as i merge the PR --------- Co-authored-by: yichuan520030910320 <yichuan_wang@berkeley.edu>	2025-10-30 16:39:14 -07:00
CelineNi2	abf312d998	Display context chunks in ask and search results (#149 ) * Printing querying time * Adding source name to chunks Adding source name as metadata to chunks, then printing the sources when searching * Printing the context provided to LLM To check the data transmitted to the LLMs : display the relevance, ID, content, and source of each sent chunk. * Correcting source as metadata for chunks * Applying ruff format * Applying Ruff formatting * Ruff formatting	2025-10-23 15:03:59 -07:00
CelineNi2	6495833887	Changing the option name "--backend" for "--backend-name" as written in the documentation (#146)	2025-10-14 13:35:10 -07:00
Jon Haddad	0bba4b2157	Add readline support to interactive command line interfaces (#121 ) * Add readline support to interactive command line interfaces - Implement readline history, navigation, and editing for CLI, API, and RAG chat modes - Create shared InteractiveSession class to consolidate readline functionality - Add command history persistence across sessions with separate files per context - Support built-in commands: help, clear, history, quit/exit - Enable arrow key navigation and command editing in all interactive modes * Improvements based on feedback	2025-10-05 17:38:15 -07:00
Andy Lee	ec889f7ef4	Allow 'leann ask' to accept a positional question (#116)	2025-09-23 21:18:57 -07:00
Andy Lee	db7ba27ff6	feat: Add support for configurable local LLM endpoints (#115 ) * feat: support configurable local llm endpoints * docs	2025-09-23 15:12:13 -07:00
Andy Lee	e93c0dec6f	[Fix] Enable AST chunking when installed (package chunking utils) (#101 ) * fix(core): package chunking utils for AST chunking; re-export in apps; CLI imports packaged utils * style * chore: fix ruff warnings (RUF059, F401) * style	2025-09-17 18:44:00 -07:00
Yichuan Wang	d41e467df9	[CLI] More robust leann list and leann build (#84 ) * chore(submodule): bump faiss to latest storage-efficient build * [chore] add slack to share use case * [cli] better gitignore / better leann list * [cli] fix # 81	2025-09-01 18:36:27 -07:00
Gabriel Dehan	13bb561aad	Add AST-aware code chunking for better code understanding (#58 ) * feat(core): Add AST-aware code chunking with astchunk integration This PR introduces intelligent code chunking that preserves semantic boundaries (functions, classes, methods) for better code understanding in RAG applications. Key Features: - AST-aware chunking for Python, Java, C#, TypeScript files - Graceful fallback to traditional chunking for unsupported languages - New specialized code RAG application for repositories - Enhanced CLI with --use-ast-chunking flag - Comprehensive test suite with integration tests Technical Implementation: - New chunking_utils.py module with enhanced chunking logic - Extended base RAG framework with AST chunking arguments - Updated document RAG with --enable-code-chunking flag - CLI integration with proper error handling and fallback Benefits: - Better semantic understanding of code structure - Improved search quality for code-related queries - Maintains backward compatibility with existing workflows - Supports mixed content (code + documentation) seamlessly Dependencies: - Added astchunk and tree-sitter parsers to pyproject.toml - All dependencies are optional - fallback works without them Testing: - Comprehensive test suite in test_astchunk_integration.py - Integration tests with document RAG - Error handling and edge case coverage Documentation: - Updated README.md with AST chunking highlights - Added ASTCHUNK_INTEGRATION.md with complete guide - Updated features.md with new capabilities * Refactored chunk utils * Remove useless import * Update README.md * Update apps/chunking/utils.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update apps/code_rag.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Fix issue * apply suggestion from @Copilot Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Fixes after pr review * Fix tests not passing * Fix linter error for 
documentation files * Update .gitignore with unwanted files --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Andy Lee <andylizf@outlook.com>	2025-08-19 23:35:31 -07:00
Andy Lee	03af82d695	fix: leann mcp search cwd & interactive issues (#72)	2025-08-19 02:27:06 -07:00
yichuan520030910320	37d990d51c	[feature] fix cli	2025-08-18 22:55:43 -07:00
Andy Lee	838ade231e	🔗 Auto-register apps: Universal index discovery (#64 ) * feat: Enhance CLI with improved list and smart remove commands ## ✨ New Features ### 🏠 Enhanced `leann list` command - Better UX: Current project shown first with clear separation - Visual improvements: Icons (🏠/📂), better formatting, size info - Smart guidance: Context-aware usage examples and getting started tips ### 🛡️ Smart `leann remove` command - Safety first: Always shows ALL matching indexes across projects - Intelligent handling: - Single match: Clear location display with cross-project warnings - Multiple matches: Interactive selection with final confirmation - Prevents accidents: No more deleting wrong indexes due to name conflicts - User-friendly: 'c' to cancel, clear visual hierarchy, detailed info ### 🔧 Technical improvements - Clean logging: Hide debug messages for better CLI experience - Comprehensive search: Always scan all projects for transparency - Error handling: Graceful handling of edge cases and user input ## 🎯 Impact - Safer: Eliminates risk of accidental index deletion - Clearer: Users always know what they're operating on - Smarter: Automatic detection and handling of common scenarios 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * chore: vscode ruff, and format --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-08-16 11:50:25 -07:00
Andy Lee	da6540decd	feat: Enhance CLI with improved list and smart remove commands (#63 ) - Better UX: Current project shown first with clear separation - Visual improvements: Icons (🏠/📂), better formatting, size info - Smart guidance: Context-aware usage examples and getting started tips - Safety first: Always shows ALL matching indexes across projects - Intelligent handling: - Single match: Clear location display with cross-project warnings - Multiple matches: Interactive selection with final confirmation - Prevents accidents: No more deleting wrong indexes due to name conflicts - User-friendly: 'c' to cancel, clear visual hierarchy, detailed info - Clean logging: Hide debug messages for better CLI experience - Comprehensive search: Always scan all projects for transparency - Error handling: Graceful handling of edge cases and user input - Safer: Eliminates risk of accidental index deletion - Clearer: Users always know what they're operating on - Smarter: Automatic detection and handling of common scenarios	2025-08-15 23:49:47 -07:00
Yichuan Wang	2dcfca19ff	style: apply ruff format (#56)	2025-08-15 17:48:33 -07:00
Andy Lee	db3c63c441	Docs/Core: Low-Resource Setups, SkyPilot Option, and No-Recompute (#45 ) * docs: add SkyPilot template and instructions for running embeddings/index build on cloud GPU * docs: add low-resource note in README; point to config guide; suggest OpenAI embeddings, SkyPilot remote build, and --no-recompute * docs: consolidate low-resource guidance into config guide; README points to it * cli: add --no-recompute and --no-recompute-embeddings flags; docs: clarify HNSW requires --no-compact when disabling recompute * docs: dedupe recomputation guidance; keep single Low-resource setups section * sky: expand leann-build.yaml with configurable params and flags (backend, recompute, compact, embedding options) * hnsw: auto-disable compact when --no-recompute is used; docs: expand SkyPilot with -e overrides and copy-back example * docs+sky: simplify SkyPilot flow (auto-build on launch, rsync copy-back); clarify HNSW auto non-compact when no-recompute * feat: auto compact for hnsw when recompute * reader: non-destructive portability (relative hints + fallback); fix comments; sky: refine yaml * cli: unify flags to --recompute/--no-recompute for build/search/ask; docs: update references * chore: remove * hnsw: move pruned/no-recompute assertion into backend; api: drop global assertion; docs: will adjust after benchmarking * cli: use argparse.BooleanOptionalAction for paired flags (--recompute/--compact) across build/search/ask * docs: a real example on recompute * benchmarks: fix and extend HNSW+DiskANN recompute vs no-recompute; docs: add fresh numbers and DiskANN notes * benchmarks: unify HNSW & DiskANN into one clean script; isolate groups, fixed ports, warm-up, param complexity * docs: diskann recompute * core: auto-cleanup for LeannSearcher/LeannChat (__enter__/__exit__/__del__); ensure server terminate/kill robustness; benchmarks: use searcher.cleanup(); docs: suggest uv run * fix: hang on warnings * docs: boolean flags * docs: leann help	2025-08-15 12:03:19 -07:00
yichuan520030910320	42c8370709	add chunk size in leann build& fix batch size in oai& docs	2025-08-14 13:14:14 -07:00
Yichuan Wang	eab13434ef	feat: support multiple input formats for --docs argument (#39)	2025-08-12 10:30:31 -07:00
Andy Lee	792ece67dc	ci: add Mac Intel (x86_64) build support (#26 ) * ci: add Mac Intel (x86_64) build support * fix: auto-detect Homebrew path for Intel vs Apple Silicon Macs This fixes the hardcoded /opt/homebrew path which only works on Apple Silicon Macs. Intel Macs use /usr/local as the Homebrew prefix. * fix: auto-detect Homebrew paths for both DiskANN and HNSW backends - Fix DiskANN CMakeLists.txt path reference - Add macOS environment variable detection for OpenMP_ROOT - Support both Intel (/usr/local) and Apple Silicon (/opt/homebrew) paths * fix: improve macOS build reliability with proper OpenMP path detection - Add proper CMAKE_PREFIX_PATH and OpenMP_ROOT detection for both Intel and Apple Silicon Macs - Set LDFLAGS and CPPFLAGS for all Homebrew packages to ensure CMake can find them - Apply CMAKE_ARGS to both HNSW and DiskANN backends for consistent builds - Fix hardcoded paths that caused build failures on Intel Macs (macos-13) 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: add abseil library path for protobuf compilation on macOS - Include abseil in CMAKE_PREFIX_PATH for both Intel and Apple Silicon Macs - Add explicit absl_DIR CMake variable to help find abseil for protobuf - Fixes 'absl/log/absl_log.h' file not found error during compilation 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: add abseil include path to CPPFLAGS for both Intel and Apple Silicon - Add -I/opt/homebrew/opt/abseil/include to CPPFLAGS for Apple Silicon - Add -I/usr/local/opt/abseil/include to CPPFLAGS for Intel - Fixes 'absl/log/absl_log.h' file not found by ensuring abseil headers are in compiler include path Root cause: CMAKE_PREFIX_PATH alone wasn't sufficient - compiler needs explicit -I flags 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: clean build system and Python 3.9 compatibility 
Build system improvements: - Simplify macOS environment detection using brew --prefix - Remove complex hardcoded paths and CMAKE_ARGS - Let CMake automatically find Homebrew packages via CMAKE_PREFIX_PATH - Clean separation between Intel (/usr/local) and Apple Silicon (/opt/homebrew) Python 3.9 compatibility: - Set ruff target-version to py39 to match project requirements - Replace str \| None with Union[str, None] in type annotations - Add Union imports where needed - Fix core interface, CLI, chat, and embedding server files 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: type * fix: ensure CMAKE_PREFIX_PATH is passed to backend builds - Add CMAKE_ARGS with CMAKE_PREFIX_PATH and OpenMP_ROOT for both HNSW and DiskANN backends - This ensures CMake can find Homebrew packages on both Intel (/usr/local) and Apple Silicon (/opt/homebrew) - Fixes the issue where CMake was still looking for hardcoded paths instead of using detected ones 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: configure CMake paths in pyproject.toml for proper Homebrew detection - Add CMAKE_PREFIX_PATH and OpenMP_ROOT environment variable mapping in both backends - Remove CMAKE_ARGS from GitHub Actions workflow (cleaner separation) - Ensure scikit-build-core correctly uses environment variables for CMake configuration - This should fix the hardcoded /opt/homebrew paths on Intel Macs 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: remove hardcoded /opt/homebrew paths from DiskANN CMake - Auto-detect Homebrew libomp path using OpenMP_ROOT environment variable - Fallback to CMAKE_PREFIX_PATH/opt/libomp if OpenMP_ROOT not set - Final fallback to brew --prefix libomp for auto-detection - Maintains backwards compatibility with old hardcoded path - Fixes Intel Mac builds that were failing due to hardcoded Apple Silicon paths 
🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: update DiskANN submodule with macOS Intel/Apple Silicon compatibility fixes - Auto-detect Homebrew libomp path using OpenMP_ROOT environment variable - Exclude mkl_set_num_threads on macOS (uses Accelerate framework instead of MKL) - Fixes compilation on Intel Macs by using correct /usr/local paths 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: update DiskANN submodule with SIMD function name corrections - Fix _mm128_loadu_ps to _mm_loadu_ps (and similar functions) - This is a known issue in upstream DiskANN code where incorrect function names were used - Resolves compilation errors on macOS Intel builds References: Known DiskANN issue with SIMD intrinsics naming 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: update DiskANN submodule with type cast fix for signed char templates - Add missing type casts (float)a and (float)b in SSE2 version - This matches the existing type casts in the AVX version - Fixes compilation error when instantiating DistanceInnerProduct<int8_t> - Resolves "cannot initialize const float* with const signed char" error 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> fix: update Faiss submodule with override keyword fix - Add missing override keyword to IDSelectorModulo::is_member function - Fixes C++ compilation warning that was treated as error due to -Werror flag - Resolves "warning: 'is_member' overrides a member function but is not marked 'override'" - Improves code conformance to modern C++ best practices 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: update Faiss submodule with override keyword fix * fix: update DiskANN submodule with additional type cast fix - Add missing type cast in 
DistanceFastL2::norm function SSE2 version - Fixes const float* = const signed char* compilation error - Ensures consistent type casting across all SIMD code paths - Resolves template instantiation error for DistanceFastL2<int8_t> 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * debug: simplify wheel compatibility checking - Fix YAML syntax error in debug step - Use simpler approach to show platform tags and wheel names - This will help identify platform tag compatibility issues 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: use correct Python version for wheel builds - Replace --python python with --python ${{ matrix.python }} - This ensures wheels are built for the correct Python version in each matrix job - Fixes Python version mismatch where cp39 wheels were used in cp311 environments 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: resolve wheel installation conflicts in CI matrix builds Fix issue where multiple Python versions' wheels in the same dist directory caused installation conflicts during CI testing. The problem occurred when matrix builds for different Python versions accumulated wheels in shared directories, and uv pip install would find incompatible wheels. Changes: - Add Python version detection using matrix.python variable - Convert Python version to wheel tag format (e.g., 3.11 -> cp311) - Use find with version-specific pattern matching to select correct wheels - Add explicit error handling if no matching wheel is found This ensures each CI job installs only wheels compatible with its specific Python version, preventing "A path dependency is incompatible with the current platform" errors. 
🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: ensure virtual environment uses correct Python version in CI Fix issue where uv venv was creating virtual environments with a different Python version than specified in the matrix, causing wheel compatibility errors. The problem occurred when the system had multiple Python versions and uv venv defaulted to a different version than intended. Changes: - Add --python ${{ matrix.python }} flag to uv venv command - Ensures virtual environment matches the matrix-specified Python version - Fixes "The wheel is compatible with CPython 3.X but you're using CPython 3.Y" errors This ensures wheel installation selects and installs the correctly built wheels that match the runtime Python version. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: complete Python 3.9 type annotation compatibility fixes Fix remaining Python 3.9 incompatible type annotations throughout the leann-core package that were causing test failures in CI. The union operator (\|) syntax for type hints was introduced in Python 3.10 and causes "TypeError: unsupported operand type(s) for \|" errors in Python 3.9. Changes: - Convert dict[str, Any] \| None to Optional[dict[str, Any]] - Convert int \| None to Optional[int] - Convert subprocess.Popen \| None to Optional[subprocess.Popen] - Convert LeannBackendFactoryInterface \| None to Optional[LeannBackendFactoryInterface] - Add missing Optional imports to all affected files This resolves all test failures related to type annotation syntax and ensures compatibility with Python 3.9 as specified in pyproject.toml. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: complete Python 3.9 type annotation fixes in backend packages Fix remaining Python 3.9 incompatible type annotations in backend packages that were causing test failures. 
The union operator (\|) syntax for type hints was introduced in Python 3.10 and causes "TypeError: unsupported operand type(s) for \|" errors in Python 3.9. Changes in leann-backend-diskann: - Convert zmq_port: int \| None to Optional[int] in diskann_backend.py - Convert passages_file: str \| None to Optional[str] in diskann_embedding_server.py - Add Optional imports to both files Changes in leann-backend-hnsw: - Convert zmq_port: int \| None to Optional[int] in hnsw_backend.py - Add Optional import This resolves the final test failures related to type annotation syntax and ensures full Python 3.9 compatibility across all packages. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: remove Python 3.10+ zip strict parameter for Python 3.9 compatibility Remove the strict=False parameter from zip() call in api.py as it was introduced in Python 3.10 and causes "TypeError: zip() takes no keyword arguments" in Python 3.9. The strict parameter controls whether zip() raises an exception when the iterables have different lengths. Since we're not relying on this behavior and the code works correctly without it, removing it maintains the same functionality while ensuring Python 3.9 compatibility. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: ensure leann-core package is built on all platforms, not just Ubuntu This fixes the issue where CI was installing leann-core from PyPI instead of using locally built package with Python 3.9 compatibility fixes. * fix: build and install leann meta package on all platforms The leann meta package is pure Python and platform-independent, so there's no reason to restrict it to Ubuntu only. This ensures all platforms use consistent local builds instead of falling back to PyPI versions. * fix: restrict MLX dependencies to Apple Silicon Macs only MLX framework only supports Apple Silicon (ARM64) Macs, not Intel x86_64. 
Add platform_machine == 'arm64' condition to prevent installation failures on Intel Macs (macos-13). * cleanup: simplify CI configuration - Remove debug step with non-existent 'uv pip debug' command - Simplify wheel installation logic - let uv handle compatibility - Use -e .[test] instead of manually listing all test dependencies * fix: install backend wheels before meta packages Install backend wheels first to ensure they're available when core/meta packages are installed, preventing uv from trying to resolve backend dependencies from PyPI. * fix: use local leann-core when building backend packages Add --find-links to backend builds to ensure they use the locally built leann-core with fixed MLX dependencies instead of downloading from PyPI. Also bump leann-core version to 0.2.8 to ensure clean dependency resolution. * fix: use absolute path for find-links and upgrade backend version - Use GITHUB_WORKSPACE for absolute path to ensure find-links works - Upgrade leann-backend-hnsw to 0.2.8 to match leann-core version * fix: use absolute path for find-links and upgrade backend version - Use GITHUB_WORKSPACE for absolute path to ensure find-links works - Upgrade leann-backend-hnsw to 0.2.8 to match leann-core version * fix: correct version consistency for --find-links to work properly - All packages now use version 0.2.7 consistently - Backend packages can find exact leann-core==0.2.7 from local build - This ensures --find-links works during CI builds instead of falling back to PyPI 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: revert all packages to consistent version 0.2.7 - This PR should not bump versions, only fix Intel Mac build - Version bumps should be done in release_manual workflow - All packages now use 0.2.7 consistently for --find-links to work 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: use --find-links during package 
installation to avoid PyPI MLX conflicts - Backend wheels contain Requires-Dist: leann-core==0.2.7 - Without --find-links, uv resolves this from PyPI which has MLX for all Darwin - With --find-links, uv uses local leann-core with proper platform restrictions - Root cause: dependency resolution happens at install time, not just build time - Local test confirms this fixes Intel Mac MLX dependency issues 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: restrict MLX dependencies to ARM64 Macs in workspace pyproject.toml - Root pyproject.toml also had MLX dependencies without platform_machine restriction - This caused test dependency installation to fail on Intel Macs - Now consistent with packages/leann-core/pyproject.toml platform restrictions 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * chore: cleanup unused files and fix GitHub Actions warnings - Remove unused packages/leann-backend-diskann/CMakeLists.txt (DiskANN uses cmake.source-dir=third_party/DiskANN instead) - Replace macos-latest with macos-14 to avoid migration warnings (macos-latest will migrate to macOS 15 on August 4, 2025) - Keep packages/leann-backend-hnsw/CMakeLists.txt (needed for Faiss config) 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: properly handle Python 3.13 support with PyTorch compatibility - Support Python 3.13 on most platforms (Ubuntu, ARM64 Mac) - Exclude Intel Mac + Python 3.13 combination due to PyTorch wheel availability - PyTorch <2.5 supports Intel Mac but not Python 3.13 - PyTorch 2.5+ supports Python 3.13 but not Intel Mac x86_64 - Document limitation in CI configuration comments - Update README badges with detailed Python version support and CI status 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-08-11 16:39:58 -07:00
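Note on the wheel-selection fix in the commit above: the message describes converting the matrix Python version to a wheel tag (e.g., 3.11 -> cp311), selecting only wheels that match it, and failing explicitly when none is found. The CI does this in shell with `find`; the Python below is only a minimal sketch of the same logic. The function names and example wheel filenames are hypothetical and are not taken from the repository.

```python
import re


def python_version_to_wheel_tag(version: str) -> str:
    """Convert a 'major.minor' version such as '3.11' to a CPython tag such as 'cp311'."""
    match = re.fullmatch(r"(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"expected a 'major.minor' version, got {version!r}")
    major, minor = match.groups()
    return f"cp{major}{minor}"


def select_wheels(wheel_names: list, version: str) -> list:
    """Return only the wheel filenames built for the given Python version.

    Raises FileNotFoundError when nothing matches, mirroring the explicit
    error handling the commit message mentions.
    """
    tag = python_version_to_wheel_tag(version)
    # Wheel filenames separate their tags with '-', so '-cp311-' cannot
    # accidentally match a 'cp39' or 'cp312' wheel.
    matches = [
        name for name in wheel_names
        if name.endswith(".whl") and f"-{tag}-" in name
    ]
    if not matches:
        raise FileNotFoundError(f"no wheel found for tag {tag}")
    return matches


if __name__ == "__main__":
    # Hypothetical filenames, for illustration only.
    dist = [
        "leann_backend_hnsw-0.2.7-cp39-cp39-manylinux_2_35_x86_64.whl",
        "leann_backend_hnsw-0.2.7-cp311-cp311-manylinux_2_35_x86_64.whl",
    ]
    print(select_wheels(dist, "3.11"))
```

Matching on the delimited tag rather than a bare substring is the design choice that keeps a shared dist directory with several Python versions from yielding an incompatible wheel.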
Andy Lee	2fac0c6fbf	fix: improve gitignore and Jupyter notebook support (#28 ) - Add nbconvert dependency for .ipynb file support - Replace manual gitignore parsing with gitignore-parser library - Proper recursive .gitignore handling (all subdirectories) - Fix compliance with Git gitignore behavior - Simplify code and improve reliability 🤖 Generated with [Claude Code](https://claude.ai/code) Co-authored-by: Claude <noreply@anthropic.com>	2025-08-10 20:02:46 -07:00
Andy Lee	8b9c2be8c9	Feat/claude code refine (#24 ) * feat: Add Ollama embedding support for local embedding models * docs: Add clear documentation for Ollama embedding usage * fix: remove leann_ask * docs: remove ollama embedding extra instructions * simplify MCP interface for Claude Code - Remove unnecessary search parameters: search_mode, recompute_embeddings, file_types, min_score - Remove leann_clear tool (not needed for Claude Code workflow) - Streamline search to only use: query, index_name, top_k, complexity - Keep core tools: leann_index, leann_search, leann_status, leann_list 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * remove leann_index from MCP interface Users should use CLI command 'leann build' to create indexes first. MCP now only provides search functionality: - leann_search: search existing indexes - leann_status: check index health - leann_list: list available indexes This separates index creation (CLI) from search (Claude Code). 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * improve CLI with auto project name and .gitignore support - Make index_name optional, auto-use current directory name - Read .gitignore patterns and respect them during indexing - Add _read_gitignore_patterns() to parse .gitignore files - Add _should_exclude_file() for pattern matching - Apply exclusion patterns to both PDF and general file processing - Show helpful messages about gitignore usage Now users can simply run: leann build And it will use project name + respect .gitignore patterns. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-08-09 20:37:17 -07:00
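Note on the .gitignore support in the commit above: the message mentions reading .gitignore patterns and matching files against them before indexing. The sketch below shows one simple way such manual matching could work using only the standard library. It is an assumption for illustration, not the project's code: the function bodies are hypothetical, negation patterns (`!`) are skipped, and anchoring rules are ignored, which is the kind of gap the later commit (#28) addresses by switching to the gitignore-parser library.

```python
import fnmatch
from pathlib import PurePosixPath


def read_gitignore_patterns(text: str) -> list:
    """Extract usable patterns from the contents of a .gitignore file.

    Simplification: blank lines, comments, and negation patterns ('!')
    are dropped, and a trailing '/' on directory patterns is stripped.
    """
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line.startswith("!"):
            continue
        patterns.append(line.rstrip("/"))
    return patterns


def should_exclude_file(relative_path: str, patterns: list) -> bool:
    """Return True if the path, or any of its components, matches a pattern."""
    path = PurePosixPath(relative_path)
    for pattern in patterns:
        # Match against the whole relative path (fnmatch's '*' also spans '/').
        if fnmatch.fnmatchcase(path.as_posix(), pattern):
            return True
        # Match against each component so 'node_modules' excludes everything under it.
        if any(fnmatch.fnmatchcase(part, pattern) for part in path.parts):
            return True
    return False


if __name__ == "__main__":
    rules = read_gitignore_patterns("# build output\n*.pyc\nnode_modules/\n")
    for candidate in ["src/app.pyc", "node_modules/pkg/index.js", "src/main.py"]:
        print(candidate, should_exclude_file(candidate, rules))
```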
Andy Lee	3ff5aac8e0	Add Ollama embedding support to enable local embedding models (#22 ) * feat: Add Ollama embedding support for local embedding models * docs: Add clear documentation for Ollama embedding usage * feat: Enhance Ollama embedding with better error handling and concurrent processing - Add intelligent model validation and suggestions (inspired by OllamaChat) - Implement concurrent processing for better performance - Add retry mechanism with timeout handling - Provide user-friendly error messages with emojis - Auto-detect and recommend embedding models - Add text truncation for long texts - Improve progress bar display logic * docs: don't mention it in README	2025-08-08 18:44:07 -07:00
yichuan520030910320	e4bcc76f88	fix cli & make recompute default true	2025-08-07 18:58:04 -07:00
yichuan520030910320	710e83b1fd	fix cli if there is no other type of doc to make it robust	2025-08-07 18:46:05 -07:00
yichuan520030910320	c96d653072	more support for type of docs in cli	2025-08-07 18:14:03 -07:00
Andy Lee	8b22d2b5d3	Merge pull request #19 from yichuan-w/feature/claude-code-research Feature/claude code research	2025-08-05 23:02:34 -07:00
yichuan520030910320	f94ce63d51	add gpt oss! serve your RAG using ollama	2025-08-05 16:49:52 -07:00
Andy Lee	b3e9ee96fa	fix: resolve all ruff linting errors and add lint CI check - Fix ambiguous fullwidth characters (commas, parentheses) in strings and comments - Replace Chinese comments with English equivalents - Fix unused imports with proper noqa annotations for intentional imports - Fix bare except clauses with specific exception types - Fix redefined variables and undefined names - Add ruff noqa annotations for generated protobuf files - Add lint and format check to GitHub Actions CI pipeline	2025-07-26 22:38:13 -07:00
yichuan520030910320	cdb92f7cf4	update pytoml version && fix colab env && fix pdf extract in pip	2025-07-26 16:33:13 -07:00
Andy Lee	71e5f1774c	docs: cli	2025-07-21 23:48:40 -07:00
Andy Lee	1b6272ce0e	Building, CLI tool & Embedding Server Fixed (#5 ) * chore: shorter build time * chore: update faiss * fix: no longer do embedding server reuse * fix: do not reuse emb_server and close it properly * feat: cli tool * feat: cli more args * fix: same embedding logic	2025-07-21 20:17:25 -07:00

36 Commits