LEANN/tests/test_mcp_standalone.py
Aakash Suresh b4bb8dec75 feat: Add MCP integration support for Slack and Twitter (#134)
* feat: Add MCP integration support for Slack and Twitter

- Implement SlackMCPReader for connecting to Slack MCP servers
- Implement TwitterMCPReader for connecting to Twitter MCP servers
- Add SlackRAG and TwitterRAG applications with full CLI support
- Support live data fetching via Model Context Protocol (MCP)
- Add comprehensive documentation and usage examples
- Include connection testing capabilities with --test-connection flag
- Add standalone tests for core functionality
- Update README with detailed MCP integration guide
- Add Aakash Suresh to Active Contributors

Resolves #36

* fix: Resolve linting issues in MCP integration

- Replace deprecated typing.Dict/List with built-in dict/list
- Fix boolean comparisons (== True/False) to direct checks
- Remove unused variables in demo script
- Update type annotations to use modern Python syntax

All pre-commit hooks should now pass.
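
A minimal sketch of the two lint patterns listed above. The function `enabled_names` is hypothetical and is not part of this PR; it only illustrates built-in generics and direct boolean checks.

```python
# Hypothetical helper, not taken from the PR.
#
# Before (the style the linter flagged):
#   from typing import Dict, List
#   def enabled_names(flags: Dict[str, bool]) -> List[str]:
#       return [name for name, on in flags.items() if on == True]


def enabled_names(flags: dict[str, bool]) -> list[str]:
    """Return the names whose flag is truthy."""
    # Built-in dict/list generics replace typing.Dict/List (Python 3.9+),
    # and `if on` replaces the `== True` comparison.
    return [name for name, on in flags.items() if on]
```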

* fix: Apply final formatting fixes for pre-commit hooks

- Remove unused imports (asyncio, pathlib.Path)
- Remove unused class imports in demo script
- Ensure all files pass ruff format and pre-commit checks

This should resolve all remaining CI linting issues.

* fix: Apply pre-commit formatting changes

- Fix trailing whitespace in all files
- Apply ruff formatting to match project standards
- Ensure consistent code style across all MCP integration files

This commit applies the exact changes that pre-commit hooks expect.

* fix: Apply pre-commit hooks formatting fixes

- Remove trailing whitespace from all files
- Fix ruff formatting issues (2 errors resolved)
- Apply consistent code formatting across 3 files
- Ensure all files pass pre-commit validation

This resolves all CI formatting failures.

* fix: Update MCP RAG classes to match BaseRAGExample signature

- Fix SlackMCPRAG and TwitterMCPRAG __init__ methods to provide required parameters
- Add name, description, and default_index_name to super().__init__ calls
- Resolves test failures: test_slack_rag_initialization and test_twitter_rag_initialization

This fixes the TypeError caused by BaseRAGExample requiring additional parameters.
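
The fix described above can be sketched as follows. `BaseRAGExampleSketch` is a stand-in written for this illustration; the real `BaseRAGExample` signature may differ, and the argument values are assumptions.

```python
class BaseRAGExampleSketch:
    """Stand-in base class with three required constructor arguments."""

    def __init__(self, name: str, description: str, default_index_name: str):
        self.name = name
        self.description = description
        self.default_index_name = default_index_name


class SlackMCPRAGSketch(BaseRAGExampleSketch):
    """Subclass that supplies every required argument to the base class."""

    def __init__(self):
        # Calling super().__init__() with no arguments would raise TypeError,
        # because the base class has no defaults for these parameters.
        super().__init__(
            name="Slack MCP RAG",  # illustrative value
            description="RAG over Slack messages fetched via MCP",  # illustrative value
            default_index_name="slack_messages",  # illustrative value
        )
```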

* style: Apply ruff formatting - add trailing commas

- Add trailing commas to super().__init__ calls in SlackMCPRAG and TwitterMCPRAG
- Fixes ruff format pre-commit hook requirements

* fix: Resolve SentenceTransformer model_kwargs parameter conflict

- Fix local_files_only parameter conflict in embedding_compute.py
- Create separate copies of model_kwargs and tokenizer_kwargs for local vs network loading
- Prevents parameter conflicts when falling back from local to network loading
- Resolves TypeError in test_readme_examples.py tests

This addresses the SentenceTransformer initialization issues in CI tests.
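
The commit message does not show the changed code, so the following is only a sketch of the copy-before-modify idea it describes. `load_model` and its `loader` parameter are hypothetical; `loader` stands in for whatever callable constructs the model.

```python
def load_model(name, model_kwargs, loader):
    """Try a local-only load first, then fall back to a network load."""
    # Copy so the caller's dict is never mutated by the local attempt.
    local_kwargs = dict(model_kwargs)
    local_kwargs["local_files_only"] = True
    try:
        return loader(name, **local_kwargs)
    except OSError:
        # Fresh copy for the network attempt, so the local-only flag
        # from the first attempt cannot leak into this call.
        network_kwargs = dict(model_kwargs)
        network_kwargs["local_files_only"] = False
        return loader(name, **network_kwargs)
```

Because each attempt builds its own dict, the caller's `model_kwargs` is unchanged after the call, whichever path succeeds.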

* fix: Add comprehensive SentenceTransformer version compatibility

- Handle both old and new sentence-transformers versions
- Gracefully fall back from advanced parameters to basic initialization
- Catch TypeError for model_kwargs/tokenizer_kwargs and use basic SentenceTransformer init
- Ensures compatibility across different CI environments and local setups
- Maintains optimization benefits where supported while ensuring broad compatibility

This resolves test failures in CI environments with older sentence-transformers versions.
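
A minimal sketch of the fallback pattern described above, assuming the older constructor rejects the newer keyword arguments with a `TypeError`. `build_model` and `factory` are hypothetical names; `factory` stands in for the model constructor.

```python
def build_model(factory, name, **advanced_kwargs):
    """Call factory with advanced kwargs, retrying without them on TypeError."""
    try:
        return factory(name, **advanced_kwargs)
    except TypeError:
        # An older version may not accept model_kwargs / tokenizer_kwargs,
        # so retry with the basic positional-only call.
        return factory(name)
```

One trade-off of this approach is that any `TypeError` raised inside the constructor triggers the retry, including ones unrelated to unsupported keywords.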

* style: Apply ruff formatting to embedding_compute.py

- Break long logger.warning lines for better readability
- Fixes pre-commit hook formatting requirements

* docs: Comprehensive documentation improvements for better user experience

- Add clear step-by-step Getting Started Guide for new users
- Add comprehensive CLI Reference with all commands and options
- Improve installation instructions with clear steps and verification
- Add detailed troubleshooting section for common issues (Ollama, OpenAI, etc.)
- Clarify difference between CLI commands and specialized apps
- Add environment variables documentation
- Improve MCP integration documentation with CLI integration examples
- Address user feedback about confusing installation and setup process

This resolves documentation gaps that made LEANN difficult for non-specialists to use.

* style: Remove trailing whitespace from README.md

- Fix trailing whitespace issues found by pre-commit hooks
- Ensures consistent formatting across documentation

* docs: Simplify README by removing excessive documentation

- Remove overly complex CLI reference and getting started sections (lines 61-334)
- Remove emojis from section headers for cleaner appearance
- Keep README simple and focused as requested
- Maintain essential MCP integration documentation

This addresses feedback to keep documentation minimal and avoid auto-generated content.

* docs: Address maintainer feedback on README improvements

- Restore emojis in section headers (Prerequisites and Quick Install)
- Mention the MCP live data feature on line 23, with links to Slack and Twitter
- Add detailed API credential setup instructions for Slack:
  - Step-by-step Slack App creation process
  - Required OAuth scopes and permissions
  - Clear token identification (xoxb- vs xapp-)
- Add detailed API credential setup instructions for Twitter:
  - Twitter Developer Account application process
  - API v2 requirements for bookmarks access
  - Required permissions and scopes

This addresses maintainer feedback to make API setup more user-friendly.
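
The token identification mentioned above can be sketched as a small prefix check. `slack_token_kind` is a hypothetical helper, not part of the PR; it relies only on the Slack convention that bot tokens start with `xoxb-` and app-level tokens start with `xapp-`.

```python
def slack_token_kind(token: str) -> str:
    """Classify a Slack token by its prefix."""
    if token.startswith("xoxb-"):
        return "bot"
    if token.startswith("xapp-"):
        return "app-level"
    # Other prefixes exist; this sketch does not try to enumerate them.
    return "unknown"
```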
2025-10-07 02:18:32 -07:00

222 lines
7.5 KiB
Python

#!/usr/bin/env python3
"""
Standalone test script for MCP integration implementations.

This script tests the basic functionality of the MCP readers
without requiring LEANN core dependencies.
"""

import json
import sys
from pathlib import Path

# Add the parent directory to the path so we can import from apps
sys.path.append(str(Path(__file__).parent.parent))


def test_slack_reader_basic():
    """Test basic SlackMCPReader functionality without async operations."""
    print("Testing SlackMCPReader basic functionality...")

    # Import and test initialization
    from apps.slack_data.slack_mcp_reader import SlackMCPReader

    reader = SlackMCPReader("slack-mcp-server")
    assert reader.mcp_server_command == "slack-mcp-server"
    assert reader.concatenate_conversations

    # Test message formatting
    message = {
        "text": "Hello team! How's the project going?",
        "user": "john_doe",
        "channel": "general",
        "ts": "1234567890.123456",
    }
    formatted = reader._format_message(message)
    assert "Channel: #general" in formatted
    assert "User: john_doe" in formatted
    assert "Message: Hello team!" in formatted

    # Test concatenated content creation
    messages = [
        {"text": "First message", "user": "alice", "ts": "1000"},
        {"text": "Second message", "user": "bob", "ts": "2000"},
    ]
    content = reader._create_concatenated_content(messages, "dev-team")
    assert "Slack Channel: #dev-team" in content
    assert "Message Count: 2" in content
    assert "First message" in content
    assert "Second message" in content

    print("✅ SlackMCPReader basic tests passed")


def test_twitter_reader_basic():
    """Test basic TwitterMCPReader functionality."""
    print("Testing TwitterMCPReader basic functionality...")

    from apps.twitter_data.twitter_mcp_reader import TwitterMCPReader

    reader = TwitterMCPReader("twitter-mcp-server")
    assert reader.mcp_server_command == "twitter-mcp-server"
    assert reader.include_tweet_content
    assert reader.max_bookmarks == 1000

    # Test bookmark formatting
    bookmark = {
        "text": "Amazing article about the future of AI! Must read for everyone interested in tech.",
        "author": "tech_guru",
        "created_at": "2024-01-15T14:30:00Z",
        "url": "https://twitter.com/tech_guru/status/123456789",
        "likes": 156,
        "retweets": 42,
        "replies": 23,
        "hashtags": ["AI", "tech", "future"],
        "mentions": ["@openai", "@anthropic"],
    }
    formatted = reader._format_bookmark(bookmark)
    assert "=== Twitter Bookmark ===" in formatted
    assert "Author: @tech_guru" in formatted
    assert "Amazing article about the future of AI!" in formatted
    assert "Likes: 156" in formatted
    assert "Retweets: 42" in formatted
    assert "Hashtags: AI, tech, future" in formatted
    assert "Mentions: @openai, @anthropic" in formatted

    # Test with minimal data
    simple_bookmark = {"text": "Short tweet", "author": "user123"}
    formatted_simple = reader._format_bookmark(simple_bookmark)
    assert "=== Twitter Bookmark ===" in formatted_simple
    assert "Short tweet" in formatted_simple
    assert "Author: @user123" in formatted_simple

    print("✅ TwitterMCPReader basic tests passed")


def test_mcp_request_format():
    """Test MCP request formatting."""
    print("Testing MCP request formatting...")

    # Test initialization request format
    init_request = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",
            "capabilities": {},
            "clientInfo": {"name": "leann-slack-reader", "version": "1.0.0"},
        },
    }

    # Verify it's valid JSON
    json_str = json.dumps(init_request)
    parsed = json.loads(json_str)
    assert parsed["jsonrpc"] == "2.0"
    assert parsed["method"] == "initialize"
    assert parsed["params"]["protocolVersion"] == "2024-11-05"

    # Test tools/list request
    list_request = {"jsonrpc": "2.0", "id": 2, "method": "tools/list", "params": {}}
    json_str = json.dumps(list_request)
    parsed = json.loads(json_str)
    assert parsed["method"] == "tools/list"

    print("✅ MCP request formatting tests passed")


def test_data_processing():
    """Test data processing capabilities."""
    print("Testing data processing capabilities...")

    from apps.slack_data.slack_mcp_reader import SlackMCPReader
    from apps.twitter_data.twitter_mcp_reader import TwitterMCPReader

    # Test Slack message processing with various formats
    slack_reader = SlackMCPReader("test-server")
    messages_with_timestamps = [
        {"text": "Meeting in 5 minutes", "user": "alice", "ts": "1000.123"},
        {"text": "On my way!", "user": "bob", "ts": "1001.456"},
        {"text": "Starting now", "user": "charlie", "ts": "1002.789"},
    ]
    content = slack_reader._create_concatenated_content(messages_with_timestamps, "meetings")
    assert "Meeting in 5 minutes" in content
    assert "On my way!" in content
    assert "Starting now" in content

    # Test Twitter bookmark processing with engagement data
    twitter_reader = TwitterMCPReader("test-server", include_metadata=True)
    high_engagement_bookmark = {
        "text": "Thread about startup lessons learned 🧵",
        "author": "startup_founder",
        "likes": 1250,
        "retweets": 340,
        "replies": 89,
    }
    formatted = twitter_reader._format_bookmark(high_engagement_bookmark)
    assert "Thread about startup lessons learned" in formatted
    assert "Likes: 1250" in formatted
    assert "Retweets: 340" in formatted
    assert "Replies: 89" in formatted

    # Test with metadata disabled
    twitter_reader_no_meta = TwitterMCPReader("test-server", include_metadata=False)
    formatted_no_meta = twitter_reader_no_meta._format_bookmark(high_engagement_bookmark)
    assert "Thread about startup lessons learned" in formatted_no_meta
    assert "Likes:" not in formatted_no_meta
    assert "Retweets:" not in formatted_no_meta

    print("✅ Data processing tests passed")


def main():
    """Run all standalone tests."""
    print("🧪 Running MCP Integration Standalone Tests")
    print("=" * 60)
    print("Testing core functionality without LEANN dependencies...")
    print()

    try:
        test_slack_reader_basic()
        test_twitter_reader_basic()
        test_mcp_request_format()
        test_data_processing()

        print("\n" + "=" * 60)
        print("🎉 All standalone tests passed!")
        print("\n✨ MCP Integration Summary:")
        print("- SlackMCPReader: Ready for Slack message processing")
        print("- TwitterMCPReader: Ready for Twitter bookmark processing")
        print("- MCP Protocol: Properly formatted JSON-RPC requests")
        print("- Data Processing: Handles various message/bookmark formats")

        print("\n🚀 Next Steps:")
        print("1. Install MCP servers: npm install -g slack-mcp-server twitter-mcp-server")
        print("2. Configure API credentials for Slack and Twitter")
        print("3. Test connections: python -m apps.slack_rag --test-connection")
        print("4. Start indexing live data from your platforms!")

        print("\n📖 Documentation:")
        print("- Check README.md for detailed setup instructions")
        print("- Run examples/mcp_integration_demo.py for usage examples")
        print("- Explore apps/slack_rag.py and apps/twitter_rag.py for implementation details")

    except Exception as e:
        print(f"\n❌ Test failed: {e}")
        import traceback

        traceback.print_exc()
        sys.exit(1)


if __name__ == "__main__":
    main()
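
The request shapes checked in `test_mcp_request_format` could be produced by a small helper like the one below. This is a sketch only: `make_request` is a hypothetical name, and the readers in this PR may build their requests differently.

```python
import json


def make_request(request_id, method, params=None):
    """Serialize a JSON-RPC 2.0 request as a single JSON string."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": method,
        # An empty object is sent when the caller supplies no params.
        "params": params if params is not None else {},
    }
    return json.dumps(payload)


# Usage: the same two requests the test constructs by hand.
init_line = make_request(
    1,
    "initialize",
    {
        "protocolVersion": "2024-11-05",
        "capabilities": {},
        "clientInfo": {"name": "leann-slack-reader", "version": "1.0.0"},
    },
)
list_line = make_request(2, "tools/list")
```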