Andy Lee
8c988cf98b
refactor: improve test structure and fix main_cli example
- Move pytest configuration from pytest.ini to pyproject.toml
- Remove unnecessary run_tests.py script (use test extras instead)
- Fix main_cli_example.py to properly use command line arguments for LLM config
- Add test_readme_examples.py to test code examples from README
- Refactor tests to use pytest fixtures and parametrization
- Update test documentation to reflect new structure
- Set proper environment variables in CI for test execution
2025-07-28 14:25:48 -07:00
Andy Lee
41812c7d22
feat: add --use-existing-index option to google_history_reader_leann.py
- Allow using existing index without rebuilding
- Useful for testing pre-built indices
2025-07-28 00:36:57 -07:00
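Note on 41812c7d22: a minimal sketch of how a flag like --use-existing-index might gate the rebuild step. Only the flag name comes from the commit. The `--index-dir` argument, the helper names `build_parser` and `should_rebuild`, and the `store_true` behavior are assumptions made for illustration, not the script's actual code.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="History reader sketch")
    # --index-dir is a hypothetical argument used only for this illustration.
    parser.add_argument("--index-dir", default="./history_index")
    # Flag name taken from the commit subject; store_true is an assumption.
    parser.add_argument(
        "--use-existing-index",
        action="store_true",
        help="Reuse the index on disk instead of rebuilding it",
    )
    return parser


def should_rebuild(args: argparse.Namespace) -> bool:
    """Rebuild unless the user asked to reuse an index that is actually present."""
    index_present = Path(args.index_dir).exists()
    return not (args.use_existing_index and index_present)
```

In this sketch a missing index still triggers a rebuild even when the flag is set, which seems a reasonable fallback when testing pre-built indices. The real script may instead raise an error.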
Andy Lee
2047a1a128
feat: add OpenAI embeddings support to google_history_reader_leann.py
- Add --embedding-model and --embedding-mode arguments
- Support automatic detection of normalized embeddings
- Works correctly with cosine distance for OpenAI embeddings
2025-07-27 23:10:20 -07:00
Andy Lee
9a5c197acd
fix: auto-detect normalized embeddings and use cosine distance
- Add automatic detection for normalized embedding models (OpenAI, Voyage AI, Cohere)
- Automatically set distance_metric='cosine' for normalized embeddings
- Add warnings when using non-optimal distance metrics
- Implement manual L2 normalization in HNSW backend (custom Faiss build lacks normalize_L2)
- Fix DiskANN zmq_port compatibility with lazy loading strategy
- Add documentation for normalized embeddings feature
This fixes the low-accuracy issue when using the OpenAI text-embedding-3-small model with the default MIPS metric.
2025-07-27 20:21:05 -07:00
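Note on 9a5c197acd: a minimal sketch of the two ideas named in the commit body, choosing cosine distance for models that emit normalized embeddings and normalizing vectors by hand when the backend has no `normalize_L2`. The prefix list, the function names, and the "mips" default string are assumptions for illustration. The commit only names the providers (OpenAI, Voyage AI, Cohere) and does not show how models are matched.

```python
import math

# Hypothetical name prefixes; how the project actually matches models is not
# shown in the commit.
NORMALIZED_PREFIXES = ("text-embedding-", "voyage-", "embed-")


def pick_metric(model_name: str, default: str = "mips") -> str:
    """Return 'cosine' for models assumed to emit unit-length vectors."""
    if model_name.startswith(NORMALIZED_PREFIXES):
        return "cosine"
    return default


def l2_normalize(vec):
    """Scale vec to unit L2 norm; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def dot(a, b):
    """Plain inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))
```

Once two vectors have unit length, their inner product equals their cosine similarity, so `dot(l2_normalize(a), l2_normalize(b))` can stand in for a cosine computation.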
yichuan520030910320
af1790395a
fix ruff errors and formatting
2025-07-27 02:22:54 -07:00
Andy Lee
b3e9ee96fa
fix: resolve all ruff linting errors and add lint CI check
- Fix ambiguous fullwidth characters (commas, parentheses) in strings and comments
- Replace Chinese comments with English equivalents
- Fix unused imports with proper noqa annotations for intentional imports
- Fix bare except clauses with specific exception types
- Fix redefined variables and undefined names
- Add ruff noqa annotations for generated protobuf files
- Add lint and format check to GitHub Actions CI pipeline
2025-07-26 22:38:13 -07:00
yichuan520030910320
8537a6b17e
default args change
2025-07-26 21:51:14 -07:00
yichuan520030910320
cdb92f7cf4
update pytoml version && fix colab env && fix pdf extract in pip
2025-07-26 16:33:13 -07:00
yichuan520030910320
170f7644e9
simplify readme
2025-07-25 02:11:02 -07:00
yichuan520030910320
52153bbb69
update faiss compare
2025-07-25 01:45:50 -07:00
yichuan520030910320
b6d43f5fd9
add gif
2025-07-25 00:12:35 -07:00
yichuan520030910320
de252fef31
[chat] update 30s example
2025-07-24 14:40:33 -07:00
yichuan520030910320
c083bda5b7
fix several bugs
2025-07-23 18:17:11 -07:00
yichuan520030910320
851f0f04c3
fix some para
2025-07-23 01:46:34 -07:00
yichuan520030910320
0544f96b79
default main cli to openai add data dict as a args
2025-07-22 21:56:30 -07:00
yichuan520030910320
2ebb29de65
default main cli to openai
2025-07-22 21:55:18 -07:00
yichuan520030910320
aa9a14a917
improve the email output format
2025-07-22 21:41:58 -07:00
yichuan520030910320
9efcc6d95c
Merge branch 'main' of https://github.com/yichuan-w/LEANN
2025-07-22 20:44:02 -07:00
yichuan520030910320
f3f5d91207
improve the google history output format
2025-07-22 20:43:56 -07:00
Andy Lee
43155d2811
fix: suppress resource leak logs
2025-07-22 19:53:45 -07:00
yichuan520030910320
90120d4dff
update the chat structure for better perf
2025-07-22 17:00:56 -07:00
Andy Lee
b3970793cf
fix: cache the loaded model
2025-07-21 21:20:53 -07:00
yichuan520030910320
530f6e4af5
add progress bar in build
2025-07-21 20:55:18 -07:00
yichuan520030910320
32364320f8
update wechat and we should fix the bug introduced in 1c5fec5
2025-07-21 16:22:16 -07:00
yichuan520030910320
83b7ea5a59
change wechat app split logic & merge
2025-07-19 19:44:33 -07:00
yichuan520030910320
0796a52df1
change wechat app split logic
2025-07-19 19:43:30 -07:00
Andy Lee
85b7ba0168
feat: allow building from existing embeddings
2025-07-19 01:27:37 -07:00
yichuan520030910320
aec2291f04
add embedding api
2025-07-17 22:29:31 -07:00
yichuan520030910320
335ae003ac
add data
2025-07-17 22:29:03 -07:00
yichuan520030910320
99d439577d
Merge branch 'main' of github.com:yichuan520030910320/LEANN-RAG
2025-07-17 18:15:27 -07:00
yichuan520030910320
4f83086788
update readme and auto find email
2025-07-17 18:15:17 -07:00
Andy Lee
a13c527e39
feat: openai embeddings
2025-07-17 17:02:47 -07:00
yichuan520030910320
90d9f27383
update readme and main example
2025-07-17 15:03:22 -07:00
yichuan520030910320
0db81c16cd
update readme and chrome example
2025-07-17 12:58:11 -07:00
yichuan520030910320
51255bdffa
update readme and add timer
2025-07-16 17:15:51 -07:00
Andy Lee
f77c4e38cb
perf: reuse embedding server for query embed
2025-07-16 16:12:15 -07:00
Andy Lee
e595bbb5fb
feat: hint for users about wrong model name
2025-07-15 22:40:40 -07:00
Andy Lee
125c1f6f25
fix: model name
2025-07-15 21:48:45 -07:00
yichuan520030910320
dec3ee85fd
fix main cli
2025-07-15 21:19:16 -07:00
yichuan520030910320
326783f7f1
fix mem compare fix split
2025-07-14 23:07:46 -07:00
yichuan520030910320
e5a9ca8787
fix mem compare
2025-07-14 22:55:10 -07:00
Andy Lee
f2feccdbd0
fix: mem compare
2025-07-14 16:35:08 -07:00
Andy Lee
b89e56e9c2
fix: file name
2025-07-14 15:34:56 -07:00
Andy Lee
ef01d6997a
fix: faiss only
2025-07-14 13:15:55 -07:00
Andy Lee
8b4654921b
fix: run faiss in subprocess to prevent kmp
2025-07-14 00:29:21 -07:00
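Note on 8b4654921b: a minimal sketch of running a workload in a separate interpreter so that its native libraries never load into the parent process. The commit does not say what "kmp" refers to. Reading it as an OpenMP runtime conflict is an assumption, and `run_isolated` is a hypothetical helper, not the project's code.

```python
import subprocess
import sys


def run_isolated(code: str, timeout: float = 60.0) -> str:
    """Run a Python snippet in a fresh interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", code],  # same interpreter, separate process
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()
```

A caller would pass the snippet that imports and exercises the library, for example `run_isolated("print(1 + 1)")`, and parse the printed result in the parent.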
Andy Lee
711fb4a775
feat: compare faiss
2025-07-13 22:44:16 -07:00
Andy Lee
3b5a185e60
refactor: check if current emb_server has correct passages/embedder
2025-07-13 22:43:51 -07:00
yichuan520030910320
b8e5728e6a
fix wechat application
2025-07-13 22:29:54 -07:00
yichuan520030910320
d038319d8b
update readme for wechat application
2025-07-13 22:00:49 -07:00
yichuan520030910320
c611d0f30f
update readme for mail application
2025-07-13 21:48:57 -07:00