- Fix import sorting and unused imports - Update type annotations to use built-in types (list, dict) instead of typing.List/Dict - Fix trailing whitespace and end-of-file issues - Fix Chinese fullwidth comma to regular comma - Update test_main_cli.py to test_document_rag.py - Add backward compatibility test for main_cli_example.py - Pass all pre-commit hooks (ruff, ruff-format, etc.)
2.5 KiB
2.5 KiB
Parameter Consistency Guide
This document ensures that the new unified interface maintains exact parameter compatibility with the original examples.
Parameter Mapping
Common Parameters (All Examples)
| Parameter | Default Value | Notes |
|---|---|---|
backend_name |
"hnsw" |
All examples use HNSW backend |
graph_degree |
32 |
Consistent across all |
complexity |
64 |
Consistent across all |
is_compact |
True |
NOT compact_index |
is_recompute |
True |
NOT use_recomputed_embeddings |
num_threads |
1 |
Force single-threaded mode |
chunk_size |
256 |
Consistent across all |
Example-Specific Defaults
document_rag.py (replaces main_cli_example.py)
index_dir:"./test_doc_files"(matches original)chunk_overlap:128(matches original)embedding_model:"facebook/contriever"embedding_mode:"sentence-transformers"- No max limit by default
email_rag.py (replaces mail_reader_leann.py)
index_dir:"./mail_index"(matches original)max_items:1000(wasmax_emails)chunk_overlap:25(matches original)embedding_model:"facebook/contriever"- NO
embedding_modeparameter in LeannBuilder (original doesn't have it)
browser_rag.py (replaces google_history_reader_leann.py)
index_dir:"./google_history_index"(matches original)max_items:1000(wasmax_entries)chunk_overlap:25(primary value in original)embedding_model:"facebook/contriever"embedding_mode:"sentence-transformers"
wechat_rag.py (replaces wechat_history_reader_leann.py)
index_dir:"./wechat_history_magic_test_11Debug_new"(matches original)max_items:50(wasmax_entries, much lower default)chunk_overlap:25(matches original)embedding_model:"Qwen/Qwen3-Embedding-0.6B"(special model for Chinese)- NO
embedding_modeparameter in LeannBuilder (original doesn't have it)
Implementation Notes
-
Parameter Names: The original files use
is_compactandis_recompute, not the newer names. -
Chunk Overlap: Most examples use
25except for documents which uses128. -
Embedding Mode: Only
google_history_reader_leann.pyandmain_cli_example.pyhave this parameter. -
Max Items: Each example has different defaults:
- Email/Browser: 1000
- WeChat: 50
- Documents: unlimited
-
Special Cases:
- WeChat uses a specific Chinese embedding model
- Email reader includes HTML processing option