# ๐Ÿงช Leann Sanity Checks This directory contains comprehensive sanity checks for the Leann system, ensuring all components work correctly across different configurations. ## ๐Ÿ“ Test Files ### `test_distance_functions.py` Tests all supported distance functions across DiskANN backend: - โœ… **MIPS** (Maximum Inner Product Search) - โœ… **L2** (Euclidean Distance) - โœ… **Cosine** (Cosine Similarity) ```bash uv run python tests/sanity_checks/test_distance_functions.py ``` ### `test_l2_verification.py` Specifically verifies that L2 distance is correctly implemented by: - Building indices with L2 vs Cosine metrics - Comparing search results and score ranges - Validating that different metrics produce expected score patterns ```bash uv run python tests/sanity_checks/test_l2_verification.py ``` ### `test_sanity_check.py` Comprehensive end-to-end verification including: - Distance function testing - Embedding model compatibility - Search result correctness validation - Backend integration testing ```bash uv run python tests/sanity_checks/test_sanity_check.py ``` ## ๐ŸŽฏ What These Tests Verify ### โœ… Distance Function Support - All three distance metrics (MIPS, L2, Cosine) work correctly - Score ranges are appropriate for each metric type - Different metrics can produce different rankings (as expected) ### โœ… Backend Integration - DiskANN backend properly initializes and builds indices - Graph construction completes without errors - Search operations return valid results ### โœ… Embedding Pipeline - Real-time embedding computation works - Multiple embedding models are supported - ZMQ server communication functions correctly ### โœ… End-to-End Functionality - Index building โ†’ searching โ†’ result retrieval pipeline - Metadata preservation through the entire flow - Error handling and graceful degradation ## ๐Ÿ” Expected Output When all tests pass, you should see: ``` ๐Ÿ“Š ๆต‹่ฏ•็ป“ๆžœๆ€ป็ป“: mips : โœ… ้€š่ฟ‡ l2 : โœ… ้€š่ฟ‡ cosine : โœ… ้€š่ฟ‡ ๐ŸŽ‰ ๆต‹่ฏ•ๅฎŒๆˆ! ``` ## ๐Ÿ› Troubleshooting ### Common Issues **Import Errors**: Ensure you're running from the project root: ```bash cd /path/to/leann uv run python tests/sanity_checks/test_distance_functions.py ``` **Memory Issues**: Reduce graph complexity for resource-constrained systems: ```python builder = LeannBuilder( backend_name="diskann", graph_degree=8, # Reduced from 16 complexity=16 # Reduced from 32 ) ``` **ZMQ Port Conflicts**: The tests use different ports to avoid conflicts, but you may need to kill existing processes: ```bash pkill -f "embedding_server" ``` ## ๐Ÿ“Š Performance Expectations ### Typical Timing (3 documents, consumer hardware): - **Index Building**: 2-5 seconds per distance function - **Search Query**: 50-200ms - **Recompute Mode**: 5-15 seconds (higher accuracy) ### Memory Usage: - **Index Storage**: ~1-2 MB per distance function - **Runtime Memory**: ~500MB (including model loading) ## ๐Ÿ”— Integration with CI/CD These tests are designed to be run in automated environments: ```yaml # GitHub Actions example - name: Run Sanity Checks run: | uv run python tests/sanity_checks/test_distance_functions.py uv run python tests/sanity_checks/test_l2_verification.py ``` The tests are deterministic and should produce consistent results across different platforms.