Files
LEANN/examples/mcp_integration_demo.py
Aakash Suresh b4bb8dec75 feat: Add MCP integration support for Slack and Twitter (#134)
* feat: Add MCP integration support for Slack and Twitter

- Implement SlackMCPReader for connecting to Slack MCP servers
- Implement TwitterMCPReader for connecting to Twitter MCP servers
- Add SlackRAG and TwitterRAG applications with full CLI support
- Support live data fetching via Model Context Protocol (MCP)
- Add comprehensive documentation and usage examples
- Include connection testing capabilities with --test-connection flag
- Add standalone tests for core functionality
- Update README with detailed MCP integration guide
- Add Aakash Suresh to Active Contributors

Resolves #36

* fix: Resolve linting issues in MCP integration

- Replace deprecated typing.Dict/List with built-in dict/list
- Fix boolean comparisons (== True/False) to direct checks
- Remove unused variables in demo script
- Update type annotations to use modern Python syntax

All pre-commit hooks should now pass.

* fix: Apply final formatting fixes for pre-commit hooks

- Remove unused imports (asyncio, pathlib.Path)
- Remove unused class imports in demo script
- Ensure all files pass ruff format and pre-commit checks

This should resolve all remaining CI linting issues.

* fix: Apply pre-commit formatting changes

- Fix trailing whitespace in all files
- Apply ruff formatting to match project standards
- Ensure consistent code style across all MCP integration files

This commit applies the exact changes that pre-commit hooks expect.

* fix: Apply pre-commit hooks formatting fixes

- Remove trailing whitespace from all files
- Fix ruff formatting issues (2 errors resolved)
- Apply consistent code formatting across 3 files
- Ensure all files pass pre-commit validation

This resolves all CI formatting failures.

* fix: Update MCP RAG classes to match BaseRAGExample signature

- Fix SlackMCPRAG and TwitterMCPRAG __init__ methods to provide required parameters
- Add name, description, and default_index_name to super().__init__ calls
- Resolves test failures: test_slack_rag_initialization and test_twitter_rag_initialization

This fixes the TypeError caused by BaseRAGExample requiring additional parameters.

* style: Apply ruff formatting - add trailing commas

- Add trailing commas to super().__init__ calls in SlackMCPRAG and TwitterMCPRAG
- Fixes ruff format pre-commit hook requirements

* fix: Resolve SentenceTransformer model_kwargs parameter conflict

- Fix local_files_only parameter conflict in embedding_compute.py
- Create separate copies of model_kwargs and tokenizer_kwargs for local vs network loading
- Prevents parameter conflicts when falling back from local to network loading
- Resolves TypeError in test_readme_examples.py tests

This addresses the SentenceTransformer initialization issues in CI tests.

* fix: Add comprehensive SentenceTransformer version compatibility

- Handle both old and new sentence-transformers versions
- Gracefully fallback from advanced parameters to basic initialization
- Catch TypeError for model_kwargs/tokenizer_kwargs and use basic SentenceTransformer init
- Ensures compatibility across different CI environments and local setups
- Maintains optimization benefits where supported while ensuring broad compatibility

This resolves test failures in CI environments with older sentence-transformers versions.

* style: Apply ruff formatting to embedding_compute.py

- Break long logger.warning lines for better readability
- Fixes pre-commit hook formatting requirements

* docs: Comprehensive documentation improvements for better user experience

- Add clear step-by-step Getting Started Guide for new users
- Add comprehensive CLI Reference with all commands and options
- Improve installation instructions with clear steps and verification
- Add detailed troubleshooting section for common issues (Ollama, OpenAI, etc.)
- Clarify difference between CLI commands and specialized apps
- Add environment variables documentation
- Improve MCP integration documentation with CLI integration examples
- Address user feedback about confusing installation and setup process

This resolves documentation gaps that made LEANN difficult for non-specialists to use.

* style: Remove trailing whitespace from README.md

- Fix trailing whitespace issues found by pre-commit hooks
- Ensures consistent formatting across documentation

* docs: Simplify README by removing excessive documentation

- Remove overly complex CLI reference and getting started sections (lines 61-334)
- Remove emojis from section headers for cleaner appearance
- Keep README simple and focused as requested
- Maintain essential MCP integration documentation

This addresses feedback to keep documentation minimal and avoid auto-generated content.

* docs: Address maintainer feedback on README improvements

- Restore emojis in section headers (Prerequisites and Quick Install)
- Add MCP live data feature mention in line 23 with links to Slack and Twitter
- Add detailed API credential setup instructions for Slack:
  - Step-by-step Slack App creation process
  - Required OAuth scopes and permissions
  - Clear token identification (xoxb- vs xapp-)
- Add detailed API credential setup instructions for Twitter:
  - Twitter Developer Account application process
  - API v2 requirements for bookmarks access
  - Required permissions and scopes

This addresses maintainer feedback to make API setup more user-friendly.
2025-10-07 02:18:32 -07:00

179 lines
6.1 KiB
Python

#!/usr/bin/env python3
"""
MCP Integration Examples for LEANN
This script demonstrates how to use LEANN with different MCP servers for
RAG on various platforms like Slack and Twitter.
Examples:
1. Slack message RAG via MCP
2. Twitter bookmark RAG via MCP
3. Testing MCP server connections
"""
import asyncio
import sys
from pathlib import Path
# Add the parent directory to the path so we can import from apps
sys.path.append(str(Path(__file__).parent.parent))
async def demo_slack_mcp():
"""Demonstrate Slack MCP integration."""
print("=" * 60)
print("🔥 Slack MCP RAG Demo")
print("=" * 60)
print("\n1. Testing Slack MCP server connection...")
# This would typically use a real MCP server command
# For demo purposes, we show what the command would look like
# slack_app = SlackMCPRAG() # Would be used for actual testing
# Simulate command line arguments for testing
class MockArgs:
mcp_server = "slack-mcp-server" # This would be the actual MCP server command
workspace_name = "my-workspace"
channels = ["general", "random", "dev-team"]
no_concatenate_conversations = False
max_messages_per_channel = 50
test_connection = True
print(f"MCP Server Command: {MockArgs.mcp_server}")
print(f"Workspace: {MockArgs.workspace_name}")
print(f"Channels: {', '.join(MockArgs.channels)}")
# In a real scenario, you would run:
# success = await slack_app.test_mcp_connection(MockArgs)
print("\n📝 Example usage:")
print("python -m apps.slack_rag \\")
print(" --mcp-server 'slack-mcp-server' \\")
print(" --workspace-name 'my-team' \\")
print(" --channels general dev-team \\")
print(" --test-connection")
print("\n🔍 After indexing, you could query:")
print("- 'What did the team discuss about the project deadline?'")
print("- 'Find messages about the new feature launch'")
print("- 'Show me conversations about budget planning'")
async def demo_twitter_mcp():
"""Demonstrate Twitter MCP integration."""
print("\n" + "=" * 60)
print("🐦 Twitter MCP RAG Demo")
print("=" * 60)
print("\n1. Testing Twitter MCP server connection...")
# twitter_app = TwitterMCPRAG() # Would be used for actual testing
class MockArgs:
mcp_server = "twitter-mcp-server"
username = None # Fetch all bookmarks
max_bookmarks = 500
no_tweet_content = False
no_metadata = False
test_connection = True
print(f"MCP Server Command: {MockArgs.mcp_server}")
print(f"Max Bookmarks: {MockArgs.max_bookmarks}")
print(f"Include Content: {not MockArgs.no_tweet_content}")
print(f"Include Metadata: {not MockArgs.no_metadata}")
print("\n📝 Example usage:")
print("python -m apps.twitter_rag \\")
print(" --mcp-server 'twitter-mcp-server' \\")
print(" --max-bookmarks 1000 \\")
print(" --test-connection")
print("\n🔍 After indexing, you could query:")
print("- 'What AI articles did I bookmark last month?'")
print("- 'Find tweets about machine learning techniques'")
print("- 'Show me bookmarked threads about startup advice'")
async def show_mcp_server_setup():
"""Show how to set up MCP servers."""
print("\n" + "=" * 60)
print("⚙️ MCP Server Setup Guide")
print("=" * 60)
print("\n🔧 Setting up Slack MCP Server:")
print("1. Install a Slack MCP server (example commands):")
print(" npm install -g slack-mcp-server")
print(" # OR")
print(" pip install slack-mcp-server")
print("\n2. Configure Slack credentials:")
print(" export SLACK_BOT_TOKEN='xoxb-your-bot-token'")
print(" export SLACK_APP_TOKEN='xapp-your-app-token'")
print("\n3. Test the server:")
print(" slack-mcp-server --help")
print("\n🔧 Setting up Twitter MCP Server:")
print("1. Install a Twitter MCP server:")
print(" npm install -g twitter-mcp-server")
print(" # OR")
print(" pip install twitter-mcp-server")
print("\n2. Configure Twitter API credentials:")
print(" export TWITTER_API_KEY='your-api-key'")
print(" export TWITTER_API_SECRET='your-api-secret'")
print(" export TWITTER_ACCESS_TOKEN='your-access-token'")
print(" export TWITTER_ACCESS_TOKEN_SECRET='your-access-token-secret'")
print("\n3. Test the server:")
print(" twitter-mcp-server --help")
async def show_integration_benefits():
"""Show the benefits of MCP integration."""
print("\n" + "=" * 60)
print("🌟 Benefits of MCP Integration")
print("=" * 60)
benefits = [
("🔄 Live Data Access", "Fetch real-time data from platforms without manual exports"),
("🔌 Standardized Protocol", "Use any MCP-compatible server with minimal code changes"),
("🚀 Easy Extension", "Add new platforms by implementing MCP readers"),
("🔒 Secure Access", "MCP servers handle authentication and API management"),
("📊 Rich Metadata", "Access full platform metadata (timestamps, engagement, etc.)"),
("⚡ Efficient Processing", "Stream data directly into LEANN without intermediate files"),
]
for title, description in benefits:
print(f"\n{title}")
print(f" {description}")
async def main():
"""Main demo function."""
print("🎯 LEANN MCP Integration Examples")
print("This demo shows how to integrate LEANN with MCP servers for various platforms.")
await demo_slack_mcp()
await demo_twitter_mcp()
await show_mcp_server_setup()
await show_integration_benefits()
print("\n" + "=" * 60)
print("✨ Next Steps")
print("=" * 60)
print("1. Install and configure MCP servers for your platforms")
print("2. Test connections using --test-connection flag")
print("3. Run indexing to build your RAG knowledge base")
print("4. Start querying your personal data!")
print("\n📚 For more information:")
print("- Check the README for detailed setup instructions")
print("- Look at the apps/slack_rag.py and apps/twitter_rag.py for implementation details")
print("- Explore other MCP servers for additional platforms")
if __name__ == "__main__":
asyncio.run(main())