docs: Simplify README by removing excessive documentation

- Remove overly complex CLI reference and getting started sections (lines 61-334)
- Remove emojis from section headers for cleaner appearance
- Keep README simple and focused as requested
- Maintain essential MCP integration documentation

This addresses feedback to keep documentation minimal and avoid auto-generated content.
commit dfae37d0ee
parent a43fafe44e
Author: aakash
Date: 2025-10-06 16:00:22 -07:00

README.md (293 lines changed)

@@ -48,7 +48,7 @@ LEANN achieves this through *graph-based selective recomputation* with *high-deg
## Installation
### Prerequisites: Install uv
[Install uv](https://docs.astral.sh/uv/getting-started/installation/#installation-methods) first if you don't have it. Typically, you can install it with:
@@ -56,280 +56,23 @@ LEANN achieves this through *graph-based selective recomputation* with *high-deg
curl -LsSf https://astral.sh/uv/install.sh | sh
```
### Quick Install
Clone the repository to access all the examples and applications, and install LEANN from [PyPI](https://pypi.org/project/leann/) to run them immediately.
**Step 1: Clone and Setup**
```bash
git clone https://github.com/yichuan-w/LEANN.git leann
cd leann
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
```
**Step 2: Install LEANN**
```bash
source .venv/bin/activate
uv pip install leann
```
**Step 3: Verify Installation**
```bash
leann --help
```
You should see the LEANN CLI help message. If you get an error, see [Troubleshooting](#-troubleshooting) below.
### 🌐 Global Installation (For MCP Integration)
To use LEANN with MCP servers (like Claude Code integration), install globally:
```bash
# Install globally using uv tool
uv tool install leann-core --with leann
# Verify global installation
leann --help
```
> **When to use global installation:** required for MCP integration and Claude Code, or whenever you want to run `leann` commands from any directory.
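For example, after a global install you can sanity-check the CLI from outside the repository (a minimal smoke test; any directory will do):
```bash
# The globally installed binary should resolve from any working directory
cd /tmp
leann --help
```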
## 📖 CLI Reference
LEANN provides a simple but powerful command-line interface. Here are the essential commands:
### 🔨 Building Indexes
**Basic Usage:**
```bash
leann build <index-name> --docs <files-or-directories>
```
**Examples:**
```bash
# Index a single directory
leann build my-docs --docs ./documents
# Index multiple directories
leann build my-project --docs ./src ./tests ./docs
# Index specific files and directories
leann build my-files --docs ./README.md ./src/ ./config.json
# Index only specific file types
leann build my-pdfs --docs ./documents --file-types .pdf,.docx
# Use different embedding models
leann build my-docs --docs ./documents --embedding-model sentence-transformers/all-MiniLM-L6-v2
```
### 🔍 Searching and Querying
**Search (returns ranked results):**
```bash
leann search <index-name> "your search query"
```
**Ask (conversational Q&A):**
```bash
leann ask <index-name> "your question"
```
**Examples:**
```bash
# Search for documents
leann search my-docs "machine learning algorithms"
# Ask questions about your data
leann ask my-code "How does the authentication system work?"
# Interactive mode (keeps asking questions)
leann ask my-docs --interactive
```
### 📋 Index Management
```bash
# List all indexes
leann list
# Remove an index
leann remove my-docs
# Get index information
leann info my-docs
```
### ⚙️ Configuration Options
**Embedding Models:**
```bash
# Use different embedding backends
--embedding-mode sentence-transformers # Default, runs locally
--embedding-mode openai # Requires OPENAI_API_KEY
--embedding-mode ollama # Requires Ollama server
--embedding-mode mlx # Apple Silicon only
# Specify embedding model
--embedding-model sentence-transformers/all-MiniLM-L6-v2 # Fast, 384-dim
--embedding-model sentence-transformers/all-mpnet-base-v2 # Better quality, 768-dim
--embedding-model text-embedding-ada-002 # OpenAI (requires API key)
```
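These options combine in a single build command. For instance (a sketch using the flags and models listed above; the API key is a placeholder):
```bash
# Build with OpenAI embeddings instead of the local default
export OPENAI_API_KEY="your-api-key"
leann build my-docs --docs ./documents \
  --embedding-mode openai \
  --embedding-model text-embedding-ada-002
```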
**Vector Database Backends:**
```bash
--backend hnsw # Default, good for most use cases
--backend diskann # Better for large datasets (>1M documents)
```
**File Processing:**
```bash
--file-types .pdf,.docx,.txt # Only process specific file types
--chunk-size 512 # Adjust text chunk size (default: 256)
--chunk-overlap 64 # Adjust chunk overlap (default: 128)
```
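Putting the file-processing flags together (values are illustrative, not recommendations):
```bash
# Index only PDFs, with larger chunks and a smaller overlap
leann build my-pdfs --docs ./documents \
  --file-types .pdf \
  --chunk-size 512 \
  --chunk-overlap 64
```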
### 🌐 Environment Variables
Configure LEANN behavior with environment variables:
```bash
# OpenAI Configuration
export OPENAI_API_KEY="your-api-key"
export OPENAI_BASE_URL="https://api.openai.com/v1" # Custom endpoint
# Ollama Configuration
export OLLAMA_HOST="http://localhost:11434" # Default Ollama URL
export OLLAMA_HOST="http://your-server:11434" # Custom Ollama server
# LEANN Configuration
export LEANN_LOG_LEVEL="INFO" # DEBUG, INFO, WARNING, ERROR
```
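A typical combination of these variables (the server URL is a placeholder):
```bash
# Verbose logging while embedding through a custom Ollama server
export LEANN_LOG_LEVEL="DEBUG"
export OLLAMA_HOST="http://your-server:11434"
leann build my-docs --docs ./documents --embedding-mode ollama
```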
### 🔧 Troubleshooting
**Common Issues:**
1. **"leann: command not found"**
```bash
# Make sure you're in the right environment
source .venv/bin/activate
# Or install globally
uv tool install leann-core --with leann
```
2. **Ollama connection issues**
```bash
# Check if Ollama is running
curl http://localhost:11434/api/tags
# Set custom Ollama URL
export OLLAMA_HOST="http://your-ollama-server:11434"
leann build my-docs --docs ./documents --embedding-mode ollama
```
3. **OpenAI API errors**
```bash
# Set your API key
export OPENAI_API_KEY="your-api-key"
# Use custom endpoint (e.g., Azure OpenAI)
export OPENAI_BASE_URL="https://your-endpoint.openai.azure.com/v1"
```
4. **Memory issues with large datasets**
```bash
# Use smaller batch sizes
leann build my-docs --docs ./documents --batch-size 16
# Use DiskANN for large datasets
leann build my-docs --docs ./documents --backend diskann
```
## 🚀 Getting Started Guide
**New to LEANN?** Follow this step-by-step guide to get up and running quickly.
### Step 1: Choose Your Installation Method
**For most users (recommended):**
```bash
# Quick setup - works for 90% of use cases
git clone https://github.com/yichuan-w/LEANN.git leann
cd leann
uv venv && source .venv/bin/activate
uv pip install leann
```
**For MCP integration (Claude Code, live data sources):**
```bash
# Global installation required for MCP servers
uv tool install leann-core --with leann
```
### Step 2: Verify Installation
```bash
leann --help
```
If you see the help message, you're ready to go! If not, see [Troubleshooting](#-troubleshooting) above.
### Step 3: Create Your First Index
**Simple example:**
```bash
# Create a test directory with some documents
mkdir test-docs
echo "LEANN is a vector database for personal AI" > test-docs/about.txt
echo "It uses 97% less storage than traditional solutions" > test-docs/features.txt
# Build your first index
leann build my-first-index --docs test-docs
# Search it
leann search my-first-index "vector database"
```
### Step 4: Try Real Data
**Index your documents:**
```bash
leann build my-docs --docs ~/Documents
leann search my-docs "your search query"
```
**Index your code:**
```bash
leann build my-code --docs ./src ./tests
leann ask my-code "How does authentication work?"
```
### Step 5: Explore Advanced Features
Once you're comfortable with the basics:
- **Try different embedding models**: Add `--embedding-model sentence-transformers/all-MiniLM-L6-v2`
- **Use Ollama for local LLMs**: Set up Ollama and use `--embedding-mode ollama`
- **Connect live data**: Try MCP integration for Slack, Twitter, etc.
- **Explore specialized apps**: Use `python -m apps.email_rag`, `python -m apps.browser_rag`, etc.
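Putting a few of these together (a sketch; the model name, Ollama mode, and app module are the ones shown earlier in this guide):
```bash
# Rebuild with a specific local embedding model
leann build my-docs --docs ./documents --embedding-model sentence-transformers/all-MiniLM-L6-v2
# Or compute embeddings through a running Ollama server
leann build my-docs --docs ./documents --embedding-mode ollama
# Then explore a specialized app
python -m apps.email_rag
```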
### Understanding LEANN vs Apps
**LEANN has two interfaces:**
1. **CLI Commands** (`leann build`, `leann search`, `leann ask`)
- General-purpose document indexing and search
- Works with any files and directories
- Best for: Personal documents, code, general use
2. **Specialized Apps** (`python -m apps.email_rag`, `python -m apps.chatgpt_rag`, etc.)
- Pre-built applications for specific data sources
- Handle data extraction and formatting automatically
- Best for: Email, browser history, chat exports, live data
**When to use which:**
- Use **CLI** for general documents and code
- Use **Apps** for specialized data sources (email, chats, etc.)
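Side by side, the two interfaces look like this (paths and queries are illustrative):
```bash
# CLI: general-purpose indexing and search over your own files
leann build my-docs --docs ~/Documents
leann ask my-docs "What did I write about vector databases?"
# Apps: pre-built pipelines for specific data sources
python -m apps.email_rag
```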
<!--
> Low-resource? See "Low-resource setups" in the [Configuration Guide](docs/configuration-guide.md#low-resource-setups). -->
@@ -1032,18 +775,18 @@ Once your iMessage conversations are indexed, you can search with queries like:
</details>
### MCP Integration: RAG on Live Data from Any Platform
**NEW!** Connect to live data sources through the Model Context Protocol (MCP). LEANN now supports real-time RAG on platforms like Slack, Twitter, and more through standardized MCP servers.
**Key Benefits:**
- **Live Data Access**: Fetch real-time data without manual exports
- **Standardized Protocol**: Use any MCP-compatible server
- **Easy Extension**: Add new platforms with minimal code
- **Secure Access**: MCP servers handle authentication
<details>
<summary><strong>Slack Messages: Search Your Team Conversations</strong></summary>
Transform your Slack workspace into a searchable knowledge base! Find discussions, decisions, and shared knowledge across all your channels.
@@ -1078,7 +821,7 @@ python -m apps.slack_rag \
</details>
<details>
<summary><strong>Twitter Bookmarks: Your Personal Tweet Library</strong></summary>
Search through your Twitter bookmarks! Find that perfect article, thread, or insight you saved for later.
@@ -1204,7 +947,7 @@ Try our fully agentic pipeline with auto query rewriting, semantic search planni
**🔥 Ready to supercharge your coding?** [Complete Setup Guide →](packages/leann-mcp/README.md)
## Command Line Interface
LEANN includes a powerful CLI for document processing and search. Perfect for quick document indexing and interactive chat.
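A minimal end-to-end session might look like this (index name, path, and query are illustrative):
```bash
leann build notes --docs ./notes
leann search notes "meeting summaries"
leann ask notes --interactive
```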