refactor: reorganize all examples/ and test/

2025-08-03 22:37:45 -07:00
parent 58556ef44c
commit b0239b6e4d
41 changed files with 127 additions and 1926 deletions
--- a/README.md
+++ b/README.md
@@ -216,11 +216,11 @@ Ask questions directly about your personal PDFs, documents, and any directory co
  <img src="videos/paper_clear.gif" alt="LEANN Document Search Demo" width="600">
 </p>

-The example below asks a question about summarizing our paper (uses default data in `examples/data`, which is a directory with diverse data sources: two papers, Pride and Prejudice, and a README in Chinese) and this is the **easiest example** to run here:
+The example below asks a question about summarizing our paper (uses default data in `data/`, which is a directory with diverse data sources: two papers, Pride and Prejudice, and a README in Chinese) and this is the **easiest example** to run here:

 ```bash
 source .venv/bin/activate # Don't forget to activate the virtual environment
-python ./examples/document_rag.py --query "What are the main techniques LEANN explores?"
+python ./apps/document_rag.py --query "What are the main techniques LEANN explores?"
 ```

 <details>
@@ -228,17 +228,17 @@ python ./examples/document_rag.py --query "What are the main techniques LEANN ex

 #### Parameters
 ```bash
--data-dir DIR           # Directory containing documents to process (default: examples/data)
+--data-dir DIR           # Directory containing documents to process (default: data)
 --file-types .ext .ext   # Filter by specific file types (optional - all LlamaIndex supported types if omitted)
 ```

 #### Example Commands
 ```bash
 # Process all documents with larger chunks for academic papers
-python examples/document_rag.py --data-dir "~/Documents/Papers" --chunk-size 1024
+python apps/document_rag.py --data-dir "~/Documents/Papers" --chunk-size 1024

 # Filter only markdown and Python files with smaller chunks
-python examples/document_rag.py --data-dir "./docs" --chunk-size 256 --file-types .md .py
+python apps/document_rag.py --data-dir "./docs" --chunk-size 256 --file-types .md .py
 ```

 </details>
@@ -255,7 +255,7 @@ python examples/document_rag.py --data-dir "./docs" --chunk-size 256 --file-type
 Before running the example below, you need to grant full disk access to your terminal/VS Code in System Preferences → Privacy & Security → Full Disk Access.

 ```bash
-python examples/email_rag.py --query "What's the food I ordered by DoorDash or Uber Eats mostly?"
+python apps/email_rag.py --query "What's the food I ordered by DoorDash or Uber Eats mostly?"
 ```
 **780K email chunks → 78MB storage.** Finally, search your email like you search Google.

@@ -271,10 +271,10 @@ python examples/email_rag.py --query "What's the food I ordered by DoorDash or U
 #### Example Commands
 ```bash
 # Search work emails from a specific account
-python examples/email_rag.py --mail-path "~/Library/Mail/V10/WORK_ACCOUNT"
+python apps/email_rag.py --mail-path "~/Library/Mail/V10/WORK_ACCOUNT"

 # Find all receipts and order confirmations (includes HTML)
-python examples/email_rag.py --query "receipt order confirmation invoice" --include-html
+python apps/email_rag.py --query "receipt order confirmation invoice" --include-html
 ```

 </details>
@@ -295,7 +295,7 @@ Once the index is built, you can ask questions like:
 </p>

 ```bash
-python examples/browser_rag.py --query "Tell me my browser history about machine learning?"
+python apps/browser_rag.py --query "Tell me my browser history about machine learning?"
 ```
 **38K browser entries → 6MB storage.** Your browser history becomes your personal search engine.

@@ -310,10 +310,10 @@ python examples/browser_rag.py --query "Tell me my browser history about machine
 #### Example Commands
 ```bash
 # Search academic research from your browsing history
-python examples/browser_rag.py --query "arxiv papers machine learning transformer architecture"
+python apps/browser_rag.py --query "arxiv papers machine learning transformer architecture"

 # Track competitor analysis across work profile
-python examples/browser_rag.py --chrome-profile "~/Library/Application Support/Google/Chrome/Work Profile" --max-items 5000
+python apps/browser_rag.py --chrome-profile "~/Library/Application Support/Google/Chrome/Work Profile" --max-items 5000
 ```

 </details>
@@ -353,7 +353,7 @@ Once the index is built, you can ask questions like:
 </p>

 ```bash
-python examples/wechat_rag.py --query "Show me all group chats about weekend plans"
+python apps/wechat_rag.py --query "Show me all group chats about weekend plans"
 ```
 **400K messages → 64MB storage** Search years of chat history in any language.

@@ -394,10 +394,10 @@ sudo packages/wechat-exporter/wechattweak-cli install
 #### Example Commands
 ```bash
 # Search for travel plans discussed in group chats
-python examples/wechat_rag.py --query "travel plans" --max-items 10000
+python apps/wechat_rag.py --query "travel plans" --max-items 10000

 # Re-export and search recent chats (useful after new messages)
-python examples/wechat_rag.py --force-export --query "work schedule"
+python apps/wechat_rag.py --force-export --query "work schedule"
 ```

 </details>
@@ -519,7 +519,7 @@ Options:
 ## Benchmarks


-**[Simple Example: Compare LEANN vs FAISS →](examples/compare_faiss_vs_leann.py)**
+**[Simple Example: Compare LEANN vs FAISS →](benchmarks/compare_faiss_vs_leann.py)**
 ### 📊 Storage Comparison

 | System | DPR (2.1M) | Wiki (60M) | Chat (400K) | Email (780K) | Browser (38K) |
@@ -534,8 +534,8 @@ Options:

 ```bash
 uv pip install -e ".[dev]"  # Install dev dependencies
-python examples/run_evaluation.py data/indices/dpr/dpr_diskann      # DPR dataset
-python examples/run_evaluation.py data/indices/rpj_wiki/rpj_wiki.index  # Wikipedia
+python benchmarks/run_evaluation.py data/indices/dpr/dpr_diskann      # DPR dataset
+python benchmarks/run_evaluation.py data/indices/rpj_wiki/rpj_wiki.index  # Wikipedia
 ```

 The evaluation script downloads data automatically on first run. The last three results were tested with partial personal data, and you can reproduce them with your own data!