docs: Improve configuration guide based on feedback

- List specific files in default data/ directory (2 AI papers, literature, tech report)
- Update examples to use English and better RAG-suitable queries
- Change full dataset reference to use --max-items -1
- Adjust small model guidance about upgrading to larger models when time allows
- Update top-k defaults to reflect actual default of 20
- Ensure consistent use of full model name Qwen/Qwen3-Embedding-0.6B
- Reorder optimization steps, move MLX to third position
- Remove incorrect chunk size tuning guidance
- Change README from 'Having trouble' to 'Need best practices'
Andy Lee
2025-08-04 19:29:17 -07:00
parent 00f506c0bd
commit d9b6f195c5
2 changed files with 17 additions and 23 deletions

@@ -170,7 +170,7 @@ ollama pull llama3.2:1b
 LEANN provides flexible parameters for embedding models, search strategies, and data processing to fit your specific needs.
-📚 **Having trouble with configuration?** Check our [Configuration Guide](docs/configuration-guide.md) for detailed optimization tips, model selection advice, and solutions to common issues like slow embeddings or poor search quality.
+📚 **Need configuration best practices?** Check our [Configuration Guide](docs/configuration-guide.md) for detailed optimization tips, model selection advice, and solutions to common issues like slow embeddings or poor search quality.
 <details>
 <summary><strong>📋 Click to expand: Common Parameters (Available in All Examples)</strong></summary>