Commit Graph

8 Commits

Author SHA1 Message Date
yichuan-w
07afe546ea reproduce docvqa results 2025-11-14 10:22:42 +00:00
yichuan-w
ae3b8af3df update vidore 2025-11-14 07:31:24 +00:00
yichuan-w
a9c014df9e Add timing instrumentation and multi-dataset support for multi-vector retrieval
- Add timing measurements for search operations (load and core time)
- Increase embedding batch size from 1 to 32 for better performance
- Add explicit memory cleanup with del all_embeddings
- Support loading and merging multiple datasets with different splits
- Add CLI arguments for search method selection (ann/exact/exact-all)
- Auto-detect image field names across different dataset structures
- Print candidate doc counts for performance monitoring

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-10 21:13:17 +00:00
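
A minimal sketch of the kind of changes commit a9c014df9e describes: a CLI switch for the search method, batching at size 32, wall-clock timing around a call, and image-field auto-detection. The commit log does not show the code itself, so everything here is an assumption for illustration: the flag names `--search-method` and `--batch-size`, the candidate field names, and the helpers `detect_image_field`, `timed`, and `batched` are all hypothetical.

```python
import argparse
import time

# Hypothetical candidate names; the list used by the actual script is not shown in the log.
IMAGE_FIELD_CANDIDATES = ("image", "page_image", "img")


def detect_image_field(example: dict) -> str:
    """Return the first candidate key that is present in a dataset row."""
    for name in IMAGE_FIELD_CANDIDATES:
        if name in example:
            return name
    raise KeyError(f"no image field found among {IMAGE_FIELD_CANDIDATES}")


def build_parser() -> argparse.ArgumentParser:
    """CLI with a search-method choice (ann/exact/exact-all) and a batch size."""
    parser = argparse.ArgumentParser(description="multi-vector retrieval sketch")
    parser.add_argument(
        "--search-method",  # assumed flag name
        choices=["ann", "exact", "exact-all"],
        default="ann",
    )
    parser.add_argument("--batch-size", type=int, default=32)  # assumed flag name
    return parser


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def batched(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]
```

Timing the load step and the core search step separately, as the commit message mentions, would amount to wrapping each call in `timed` and printing both durations.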
yichuan-w
3766ad1fd2 robust multi-vector 2025-11-09 02:34:53 +00:00
yichuan-w
dc6c9f696e update some search in copali 2025-11-08 08:53:03 +00:00
yichuan520030910320
01475c10a0 add img 2025-09-23 23:25:05 -07:00
yichuan520030910320
576beb13db add doc about multimodal 2025-09-23 23:21:03 -07:00
Yichuan Wang
edde0cdeb2 [Feat] ColQwen integration (#111)
* add colqwen stuff

* add colqwen stuff and pass ruff

* remove ipynb
2025-09-23 17:51:29 -07:00