LEANN/examples/document_rag.py at baf70dc411bde4eac7e51673a89d06783fdfc482

Andy Lee 274bbb19ea feat: Add chunk-size parameters and improve file type filtering

- Add --chunk-size and --chunk-overlap parameters to all RAG examples
- Preserve original default values for each data source:
  - Document: 256/128 (optimized for general documents)
  - Email: 256/25 (smaller overlap for email threads)
  - Browser: 256/128 (standard for web content)
  - WeChat: 192/64 (smaller chunks for chat messages)
- Make --file-types an optional filter instead of a restriction in document_rag
- Update README to clarify interactive mode and parameter usage
- Fix LLM default model documentation (gpt-4o, not gpt-4o-mini)
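
The parameters above can be illustrated with a minimal sketch. It is not the code in document_rag.py: the helper names (`build_parser`, `chunk_items`, `keep_file`) are hypothetical, and the commit message does not say what unit chunk size is measured in, so the sketch simply splits a list of items. Only the flag names and the default values come from the commit message.

```python
import argparse

# (chunk_size, chunk_overlap) defaults per data source, as listed in the commit message.
DEFAULTS = {
    "document": (256, 128),
    "email": (256, 25),
    "browser": (256, 128),
    "wechat": (192, 64),
}


def build_parser(source):
    """Hypothetical helper: CLI parser with per-source chunking defaults."""
    size, overlap = DEFAULTS[source]
    parser = argparse.ArgumentParser(description=f"{source} RAG example (sketch)")
    parser.add_argument("--chunk-size", type=int, default=size)
    parser.add_argument("--chunk-overlap", type=int, default=overlap)
    # Optional filter: None means every file type is accepted.
    parser.add_argument("--file-types", nargs="+", default=None)
    return parser


def chunk_items(items, chunk_size, chunk_overlap):
    """Split items into windows of chunk_size that share chunk_overlap items."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(items), step):
        chunks.append(items[start:start + chunk_size])
        if start + chunk_size >= len(items):
            break  # the last window already reaches the end of the input
    return chunks


def keep_file(path, file_types):
    """Treat --file-types as an optional filter (assumed suffix match)."""
    return file_types is None or any(path.endswith(ext) for ext in file_types)
```

With these assumptions, `build_parser("wechat").parse_args([])` yields a chunk size of 192 and an overlap of 64, while passing `--chunk-size 512` on the command line overrides the per-source default.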

2025-07-29 18:31:56 -07:00

3.5 KiB
