Andy Lee a35bfb0354 fix: comprehensive ZMQ timeout and cleanup fixes based on detailed analysis
Based on excellent diagnostic suggestions, implemented multiple fixes:

1. Diagnostics:
   - Added faulthandler to dump stack traces 10s before CI timeout
   - Enhanced CI script with trap handler to show processes/network on timeout
   - Added diag() function to capture pstree, processes, network listeners

2. ZMQ Socket Timeouts (critical fix):
   - Added RCVTIMEO=1000ms and SNDTIMEO=1000ms to all client sockets
   - Added IMMEDIATE=1 to avoid connection blocking
   - Reduced searcher timeout from 30s to 5s
   - This prevents infinite blocking on recv/send operations

3. Context.instance() Fix (major issue):
   - NEVER call term() or destroy() on Context.instance()
   - This was causing blocking as it waits for ALL sockets to close
   - Now only set linger=0 without terminating

4. Enhanced Process Cleanup:
   - Added _reap_children fixture for aggressive session-end cleanup
   - Better recursive child process termination
   - Added final wait to ensure cleanup completes

The 180s timeout was happening because:
- ZMQ recv() was blocking indefinitely without timeout
- Context.instance().term() was waiting for all sockets
- Child processes weren't being fully cleaned up

These changes should prevent the hanging completely.
2025-08-08 18:29:09 -07:00

LEANN Tests

This directory contains automated tests for the LEANN project using pytest.

Test Files

test_readme_examples.py

Tests the examples shown in README.md:

  • The basic example code that users see first
  • The import statements
  • The backend options (HNSW, DiskANN)
  • The LLM configuration options
  • All main README examples are parametrized with pytest to run against both the HNSW and DiskANN backends

test_basic.py

Basic functionality tests that verify:

  • All packages can be imported correctly
  • C++ extensions (FAISS, DiskANN) load properly
  • Basic index building and searching works for both HNSW and DiskANN backends
  • Both backends are exercised by the same tests through pytest parametrization
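
As a rough illustration of how one test body can cover both backends, here is a minimal sketch using `pytest.mark.parametrize`. The test name `test_backend_basic` and the backend IDs come from the commands in this README; `build_and_search` is a hypothetical stand-in that ignores the backend entirely, so the sketch runs without LEANN installed. It is not the project's actual test code.

```python
import pytest

# Backend names taken from the test IDs used in this README,
# e.g. test_backend_basic[hnsw].
BACKENDS = ["hnsw", "diskann"]


def build_and_search(backend_name, documents, query):
    """Hypothetical stand-in for building an index and searching it.

    It ignores backend_name and keeps documents that share at least one
    word with the query, ordered by the number of shared words.
    """
    query_words = set(query.lower().split())
    scored = [(len(query_words & set(doc.lower().split())), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored if score > 0]


@pytest.mark.parametrize("backend_name", BACKENDS)
def test_backend_basic(backend_name):
    # The same assertions run once per backend.
    documents = ["LEANN is a vector index", "pytest runs the tests"]
    results = build_and_search(backend_name, documents, "vector index")
    assert results == ["LEANN is a vector index"]
```

With string parameters like these, pytest derives the test IDs `test_backend_basic[hnsw]` and `test_backend_basic[diskann]`, which is the form used to select a single backend on the command line.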

test_document_rag.py

Tests the document RAG example functionality:

  • Tests with facebook/contriever embeddings
  • Tests with OpenAI embeddings (if API key is available)
  • Tests error handling with invalid parameters
  • Verifies that normalized embeddings are detected and cosine distance is used
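
The normalized-embedding check in the last bullet can be pictured with a small standard-library sketch. The helper names, the tolerance, and the returned labels are assumptions made for illustration only; LEANN's own detection logic may differ.

```python
import math


def l2_norm(vector):
    """Euclidean length of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in vector))


def is_normalized(vectors, tol=1e-3):
    """Return True if every vector has approximately unit L2 norm."""
    return all(abs(l2_norm(v) - 1.0) <= tol for v in vectors)


def cosine_similarity(a, b):
    """Cosine of the angle between a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (l2_norm(a) * l2_norm(b))


def choose_distance_metric(vectors):
    """Hypothetical rule: pick cosine when the embeddings are unit length.

    The labels "cosine" and "other" are placeholders, not LEANN option names.
    """
    return "cosine" if is_normalized(vectors) else "other"
```

For unit-length vectors the inner product and the cosine similarity coincide, which is why detecting normalization is enough to decide that cosine distance is appropriate.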

test_diskann_partition.py

Tests DiskANN graph partitioning functionality:

  • Tests DiskANN index building without partitioning (baseline)
  • Tests automatic graph partitioning with is_recompute=True
  • Verifies that partition files are created and that large intermediate files are removed to save storage
  • Tests search functionality with partitioned indices
  • Validates medoid and max_base_norm file generation and usage
  • Includes performance comparison between DiskANN (with partition) and HNSW
  • Note: These tests are skipped in CI due to hardware requirements and computation time

Running Tests

Install test dependencies:

# Using extras
uv pip install -e ".[test]"

Run all tests:

pytest tests/

# Or with coverage
pytest tests/ --cov=leann --cov-report=html

# Run in parallel (faster; needs the pytest-xdist plugin)
pytest tests/ -n auto

Run specific tests:

# Only basic tests
pytest tests/test_basic.py

# Only tests that don't require OpenAI
pytest tests/ -m "not openai"

# Skip slow tests
pytest tests/ -m "not slow"

# Run DiskANN partition tests (requires local machine, not CI)
pytest tests/test_diskann_partition.py

Run with specific backend:

# Test only HNSW backend (quote the test ID so shells such as zsh do not expand the brackets)
pytest "tests/test_basic.py::test_backend_basic[hnsw]"
pytest "tests/test_readme_examples.py::test_readme_basic_example[hnsw]"

# Test only DiskANN backend
pytest "tests/test_basic.py::test_backend_basic[diskann]"
pytest "tests/test_readme_examples.py::test_readme_basic_example[diskann]"

# All DiskANN tests (parametrized + specialized partition tests)
pytest tests/ -k diskann

CI/CD Integration

Tests are automatically run in GitHub Actions:

  1. After building wheel packages
  2. On multiple Python versions (3.9 - 3.13)
  3. On both Ubuntu and macOS
  4. Using pytest with appropriate markers and flags

pytest.ini Configuration

The pytest.ini file configures:

  • Test discovery paths
  • Default timeout (600 seconds)
  • Environment variables (HF_HUB_DISABLE_SYMLINKS, TOKENIZERS_PARALLELISM)
  • Custom markers for slow and OpenAI tests
  • Verbose output with short tracebacks
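
For readers unfamiliar with custom markers: the names `slow` and `openai` are the ones used with `-m` in the commands above. The project declares them in pytest.ini; the sketch below shows the equivalent registration from a hypothetical `conftest.py`, using pytest's `pytest_configure` hook. The marker descriptions are illustrative wording, not copied from the project.

```python
# conftest.py (hypothetical; the project declares its markers in pytest.ini)


def pytest_configure(config):
    """Register the custom markers so pytest does not warn about unknown marks."""
    config.addinivalue_line("markers", "slow: marks tests as slow to run")
    config.addinivalue_line("markers", "openai: marks tests that need an OpenAI API key")
```

Either form of registration lets `pytest tests/ -m "not slow"` deselect marked tests without unknown-marker warnings.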

Known Issues

  • OpenAI tests are automatically skipped if no API key is provided
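
One common way to get this behavior is a `skipif` marker that checks for the key at collection time. The sketch below assumes the key is supplied through the `OPENAI_API_KEY` environment variable; the test name and body are placeholders, and the project's actual skip mechanism may be implemented differently.

```python
import os

import pytest

# Assumption: the API key is supplied via the OPENAI_API_KEY environment variable.
requires_openai = pytest.mark.skipif(
    not os.environ.get("OPENAI_API_KEY"),
    reason="OPENAI_API_KEY is not set",
)


@pytest.mark.openai  # lets `-m "not openai"` deselect the test explicitly
@requires_openai     # skips the test automatically when no key is present
def test_openai_embeddings_placeholder():
    # Placeholder body; a real test would call the OpenAI embedding API here.
    assert os.environ["OPENAI_API_KEY"]
```

Combining the two markers keeps both paths available: the test is skipped on machines without a key, and it can still be excluded on purpose with the `-m` filter.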