🤝 Contributing

We welcome contributions! LEANN is built by the community, for the community.

Ways to Contribute

  • 🐛 Bug Reports: Found an issue? Let us know!
  • 💡 Feature Requests: Have an idea? We'd love to hear it!
  • 🔧 Code Contributions: PRs welcome for all skill levels
  • 📖 Documentation: Help make LEANN more accessible
  • 🧪 Benchmarks: Share your performance results

🚀 Development Setup

Prerequisites

  1. Install uv (fast Python package installer):

    curl -LsSf https://astral.sh/uv/install.sh | sh
    
  2. Clone the repository:

    git clone https://github.com/LEANN-RAG/LEANN-RAG.git
    cd LEANN-RAG
    
  3. Install system dependencies:

    macOS:

    brew install llvm libomp boost protobuf zeromq pkgconf
    

    Ubuntu/Debian:

    sudo apt-get install libomp-dev libboost-all-dev protobuf-compiler \
                         libabsl-dev libmkl-full-dev libaio-dev libzmq3-dev
    
  4. Build from source:

    # macOS
    CC=$(brew --prefix llvm)/bin/clang CXX=$(brew --prefix llvm)/bin/clang++ uv sync
    
    # Ubuntu/Debian
    uv sync
    

🔨 Pre-commit Hooks

We use pre-commit hooks to ensure code quality and consistency. Once installed, the hooks run automatically before each commit.

Setup Pre-commit

  1. Install pre-commit tools:

    uv sync --group lint
    
  2. Install the git hooks:

    uv run pre-commit install
    
  3. Run pre-commit manually (optional):

    uv run pre-commit run --all-files
    

Pre-commit Checks

Our pre-commit configuration includes:

  • Trailing whitespace removal
  • End-of-file fixing
  • YAML validation
  • Large file prevention
  • Merge conflict detection
  • Debug statement detection
  • Code formatting with ruff
  • Code linting with ruff

🧪 Testing

Running Tests

# Install the project together with the test dependency group
uv sync --group test

# Run all tests
uv run pytest

# Run specific test file
uv run pytest test/test_filename.py

# Run with coverage
uv run pytest --cov=leann

Writing Tests

  • Place tests in the test/ directory
  • Follow the naming convention test_*.py
  • Use descriptive test names that explain what's being tested
  • Include both positive and negative test cases
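
The conventions above can be sketched as a single test file. This is only an illustration: the file name `test/test_chunking.py` and the helper `chunk_text` are hypothetical and not part of LEANN; the helper is defined inline so the example runs on its own, whereas a real test would import the code under test.

```python
# test/test_chunking.py  (hypothetical file name following test_*.py)


def chunk_text(text: str, size: int) -> list[str]:
    """Split text into consecutive pieces of at most `size` characters.

    Hypothetical helper, defined here only to keep the example runnable.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    return [text[i : i + size] for i in range(0, len(text), size)]


def test_chunk_text_splits_into_fixed_size_pieces():
    # Positive case: valid input produces the expected chunks.
    assert chunk_text("abcdefg", size=3) == ["abc", "def", "g"]


def test_chunk_text_returns_no_chunks_for_empty_text():
    # Edge case: empty input is valid and yields an empty list.
    assert chunk_text("", size=3) == []


def test_chunk_text_rejects_non_positive_size():
    # Negative case: invalid input must raise. Plain try/except keeps the
    # file import-free; pytest.raises(ValueError) is the usual spelling.
    try:
        chunk_text("abc", size=0)
    except ValueError:
        return
    raise AssertionError("expected ValueError for size=0")
```

Each test name states the behavior it checks, so a failure report is readable without opening the file.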

📝 Code Style

We use ruff for both linting and formatting to ensure consistent code style.

Format Your Code

# Format all files
ruff format

# Check formatting without changing files
ruff format --check

Lint Your Code

# Run linter with auto-fix
ruff check --fix

# Just check without fixing
ruff check

Style Guidelines

  • Follow PEP 8 conventions
  • Use descriptive variable names
  • Add type hints where appropriate
  • Write docstrings for all public functions and classes
  • Keep functions focused and single-purpose
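
As a rough illustration of these guidelines, the sketch below shows one small function with type hints, a docstring, and a single purpose. The function `deduplicate_ids` is a made-up example, not an existing LEANN API, and the Args/Returns docstring layout is one common choice rather than a documented project requirement.

```python
from collections.abc import Iterable


def deduplicate_ids(ids: Iterable[str]) -> list[str]:
    """Return the IDs with duplicates removed, keeping first-seen order.

    Args:
        ids: Document IDs, possibly containing repeats.

    Returns:
        A new list containing each distinct ID exactly once.
    """
    seen: set[str] = set()
    unique: list[str] = []
    for doc_id in ids:
        # Only the first occurrence of each ID is kept.
        if doc_id not in seen:
            seen.add(doc_id)
            unique.append(doc_id)
    return unique
```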

🚦 CI/CD

Our CI pipeline runs automatically on all pull requests. It includes:

  1. Linting and Formatting: Ensures code follows our style guidelines
  2. Multi-platform builds: Tests on Ubuntu and macOS
  3. Python version matrix: Tests on Python 3.9-3.13
  4. Wheel building: Ensures packages can be built and distributed

CI Commands

The CI uses the same commands as pre-commit to ensure consistency:

# Linting
ruff check .

# Format checking
ruff format --check .

Make sure your code passes these checks locally before pushing!
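
One way to run both checks in a single step is a small wrapper. The sketch below is an assumption, not a script that ships with the repository: the names `CI_CHECKS`, `exit_code`, and `failed_checks` are hypothetical, and only the two `ruff` commands come from the text above.

```python
import subprocess
from typing import Callable, Sequence

# The two commands listed above, in the same order.
CI_CHECKS: tuple[tuple[str, ...], ...] = (
    ("ruff", "check", "."),
    ("ruff", "format", "--check", "."),
)


def exit_code(command: Sequence[str]) -> int:
    """Run one command and return its exit status."""
    return subprocess.run(list(command), check=False).returncode


def failed_checks(
    checks: Sequence[Sequence[str]] = CI_CHECKS,
    run: Callable[[Sequence[str]], int] = exit_code,
) -> list[str]:
    """Run every check without stopping early and return the ones that failed."""
    return [" ".join(command) for command in checks if run(command) != 0]
```

Calling `failed_checks()` with no arguments runs the real commands and assumes `ruff` is on `PATH`, for example inside the project's virtual environment. An empty list means both checks passed. The `run` parameter exists only so the function can be exercised without invoking `ruff`.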

🔄 Pull Request Process

  1. Fork the repository and create your branch from main:

    git checkout -b feature/your-feature-name
    
  2. Make your changes:

    • Write clean, documented code
    • Add tests for new functionality
    • Update documentation as needed
  3. Run pre-commit checks:

    uv run pre-commit run --all-files
    
  4. Test your changes:

    uv run pytest
    
  5. Commit with descriptive messages:

    git commit -m "feat: add new search algorithm"
    

    Follow Conventional Commits:

    • feat: for new features
    • fix: for bug fixes
    • docs: for documentation changes
    • test: for test additions/changes
    • refactor: for code refactoring
    • perf: for performance improvements
  6. Push and create a pull request:

    • Provide a clear description of your changes
    • Reference any related issues
    • Include examples or screenshots if applicable
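
The commit prefixes from step 5 can be checked mechanically before pushing. The following is a minimal sketch, assuming that only the six types listed above are accepted and that an optional `(scope)` and `!` marker are allowed as in the Conventional Commits format; `ALLOWED_TYPES` and `is_conventional_subject` are hypothetical names, not project tooling.

```python
import re

# Types taken from the list in step 5; extend the tuple if the project accepts more.
ALLOWED_TYPES = ("feat", "fix", "docs", "test", "refactor", "perf")

# <type>, optional "(scope)", optional "!", then ": " and a non-empty description.
_SUBJECT_RE = re.compile(
    r"^(?P<type>[a-z]+)(?:\((?P<scope>[^()]+)\))?!?: (?P<description>\S.*)$"
)


def is_conventional_subject(subject: str) -> bool:
    """Return True if a commit subject line uses one of the allowed prefixes."""
    match = _SUBJECT_RE.match(subject)
    return match is not None and match.group("type") in ALLOWED_TYPES
```

For example, `is_conventional_subject("feat: add new search algorithm")` is true, while a subject without a prefix, or with a missing space after the colon, is rejected.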

📚 Documentation

When adding new features or making significant changes:

  1. Update relevant documentation in /docs
  2. Add docstrings to new functions/classes
  3. Update README.md if needed
  4. Include usage examples
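
A usage example can live directly in the docstring, where it stays next to the code it documents. The sketch below uses a hypothetical function, `format_result`, purely for illustration; running the example through `doctest` is a suggestion, not an established project practice.

```python
def format_result(doc_id: str, score: float) -> str:
    """Format one search hit for display.

    Example:
        >>> format_result("doc-42", 0.87654)
        'doc-42 (score=0.877)'
    """
    # Scores are shown with three decimal places.
    return f"{doc_id} (score={score:.3f})"
```

Examples written this way can be verified with Python's standard `python -m doctest <file>` command, which fails if the documented output drifts from the actual behavior.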

🤔 Getting Help

  • Discord: Join our community for discussions
  • Issues: Check existing issues or create a new one
  • Discussions: For general questions and ideas

📄 License

By contributing, you agree that your contributions will be licensed under the same license as the project (MIT).


Thank you for contributing to LEANN! Every contribution, no matter how small, helps make the project better for everyone. 🌟