Commit Graph

6 Commits

Author SHA1 Message Date
Yichuan Wang
76cc798e3e Feat/multi vector timing and dataset improvements (#181)
* Add timing instrumentation and multi-dataset support for multi-vector retrieval

- Add timing measurements for search operations (load and core time)
- Increase embedding batch size from 1 to 32 for better performance
- Add explicit memory cleanup with del all_embeddings
- Support loading and merging multiple datasets with different splits
- Add CLI arguments for search method selection (ann/exact/exact-all)
- Auto-detect image field names across different dataset structures
- Print candidate doc counts for performance monitoring

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* update vidore

* reproduce docvqa results

* reproduce docvqa results and add debug file

* fix: format colqwen_forward.py to pass pre-commit checks

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-12-03 01:10:49 -08:00
Yichuan Wang
d599566fd7 Revert "[Multi-vector]Add timing instrumentation and multi-dataset support fo…" (#180)
This reverts commit 00770aebbb.
2025-12-03 01:09:39 -08:00
Yichuan Wang
00770aebbb [Multi-vector]Add timing instrumentation and multi-dataset support for multi-vector… (#161)
* Add timing instrumentation and multi-dataset support for multi-vector retrieval

- Add timing measurements for search operations (load and core time)
- Increase embedding batch size from 1 to 32 for better performance
- Add explicit memory cleanup with del all_embeddings
- Support loading and merging multiple datasets with different splits
- Add CLI arguments for search method selection (ann/exact/exact-all)
- Auto-detect image field names across different dataset structures
- Print candidate doc counts for performance monitoring

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* update vidore

* reproduce docvqa results

* reproduce docvqa results and add debug file

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-12-03 00:55:42 -08:00
yichuan-w
3766ad1fd2 robust multi-vector 2025-11-09 02:34:53 +00:00
yichuan-w
dc6c9f696e update some search in copali 2025-11-08 08:53:03 +00:00
Yichuan Wang
edde0cdeb2 [Feat] ColQwen intergration (#111)
* add colqwen stuff

* add colqwen stuff and pass ruff

* remove ipynb
2025-09-23 17:51:29 -07:00