fix: address root cause of test hanging - improper ZMQ/C++ resource cleanup
Fixed the actual root cause instead of just masking it in tests: 1. Root Problem: - C++ side's ZmqDistanceComputer creates ZMQ connections but doesn't clean them - Python 3.9/3.13 are more sensitive to cleanup timing during shutdown 2. Core Fixes in SearcherBase and LeannSearcher: - Added cleanup() method to BaseSearcher that cleans ZMQ and embedding server - LeannSearcher.cleanup() now also handles ZMQ context cleanup - Both HNSW and DiskANN searchers now properly delete C++ index objects 3. Backend-Specific Cleanup: - HNSWSearcher.cleanup(): Deletes self.index to trigger C++ destructors - DiskannSearcher.cleanup(): Deletes self._index and resets state - Both force garbage collection after deletion 4. Test Infrastructure: - Added auto_cleanup_searcher fixture for explicit resource management - Global cleanup now more aggressive with ZMQ context destruction This is the proper fix - cleaning up resources at the source, not just working around the issue in tests. The hanging was caused by C++ side ZMQ connections not being properly terminated when is_recompute=True.
This commit is contained in:
@@ -18,13 +18,37 @@ def global_test_cleanup() -> Generator:
|
||||
yield
|
||||
|
||||
# Cleanup after all tests
|
||||
print("\n🧹 Running global test cleanup...")
|
||||
|
||||
# 1. Force cleanup of any LeannSearcher instances
|
||||
try:
|
||||
import gc
|
||||
|
||||
# Force garbage collection to trigger __del__ methods
|
||||
gc.collect()
|
||||
time.sleep(0.2)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# 2. Terminate ZMQ contexts more aggressively
|
||||
try:
|
||||
import zmq
|
||||
|
||||
# Set a very short linger on any remaining contexts
|
||||
# This prevents blocking on context termination
|
||||
# Get the global instance and destroy it
|
||||
ctx = zmq.Context.instance()
|
||||
ctx.linger = 0
|
||||
|
||||
# Force termination - this is aggressive but needed for CI
|
||||
try:
|
||||
ctx.destroy(linger=0)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Also try to terminate the default context
|
||||
try:
|
||||
zmq.Context.term(zmq.Context.instance())
|
||||
except Exception:
|
||||
pass
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
@@ -78,6 +102,32 @@ def global_test_cleanup() -> Generator:
|
||||
pass
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def auto_cleanup_searcher():
|
||||
"""Fixture that automatically cleans up LeannSearcher instances."""
|
||||
searchers = []
|
||||
|
||||
def register(searcher):
|
||||
"""Register a searcher for cleanup."""
|
||||
searchers.append(searcher)
|
||||
return searcher
|
||||
|
||||
yield register
|
||||
|
||||
# Cleanup all registered searchers
|
||||
for searcher in searchers:
|
||||
try:
|
||||
searcher.cleanup()
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Force garbage collection
|
||||
import gc
|
||||
|
||||
gc.collect()
|
||||
time.sleep(0.1)
|
||||
|
||||
|
||||
@pytest.fixture(autouse=True)
|
||||
def cleanup_after_each_test():
|
||||
"""Cleanup after each test to prevent resource leaks."""
|
||||
|
||||
Reference in New Issue
Block a user