Initial commit
40
packages/leann-backend-diskann/third_party/DiskANN/.github/ISSUE_TEMPLATE/bug_report.md
vendored
Normal file
@@ -0,0 +1,40 @@
---
name: Bug report
about: Bug reports help us improve! Thanks for submitting yours!
title: "[BUG] "
labels: bug
assignees: ''

---

## Expected Behavior
Tell us what should happen

## Actual Behavior
Tell us what happens instead

## Example Code
Please see [How to create a Minimal, Reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) for some guidance on creating the best possible example of the problem
```bash

```

## Dataset Description
Please tell us about the shape and datatype of your data, (e.g. 128 dimensions, 12.3 billion points, floats)
- Dimensions:
- Number of Points:
- Data type:

## Error
```
Paste the full error, with any sensitive information minimally redacted and marked $$REDACTED$$

```

## Your Environment
* Operating system (e.g. Windows 11 Pro, Ubuntu 22.04.1 LTS)
* DiskANN version (or commit built from)

## Additional Details
Any other contextual information you might feel is important.

2
packages/leann-backend-diskann/third_party/DiskANN/.github/ISSUE_TEMPLATE/config.yml
vendored
Normal file
@@ -0,0 +1,2 @@
blank_issues_enabled: false

25
packages/leann-backend-diskann/third_party/DiskANN/.github/ISSUE_TEMPLATE/feature_request.md
vendored
Normal file
@@ -0,0 +1,25 @@
---
name: Feature request
about: Suggest an idea for this project
title: ''
labels: enhancement
assignees: ''

---

## Is your feature request related to a problem? Please describe.
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

## Describe the solution you'd like
A clear and concise description of what you want to happen.

## Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

## Provide references (if applicable)
If your feature request is related to a published algorithm/idea, please provide links to
any relevant articles or webpages.

## Additional context
Add any other context or screenshots about the feature request here.

11
packages/leann-backend-diskann/third_party/DiskANN/.github/ISSUE_TEMPLATE/usage-question.md
vendored
Normal file
@@ -0,0 +1,11 @@
---
name: Usage Question
about: Ask us a question about DiskANN!
title: "[Question]"
labels: question
assignees: ''

---

This is our forum for asking whatever DiskANN question you'd like! No need to feel shy - we're happy to talk about use cases and optimal tuning strategies!

22
packages/leann-backend-diskann/third_party/DiskANN/.github/PULL_REQUEST_TEMPLATE.md
vendored
Normal file
@@ -0,0 +1,22 @@
<!--
Thanks for contributing a pull request! Please ensure you have taken a look at
the contribution guidelines: https://github.com/microsoft/DiskANN/blob/main/CONTRIBUTING.md
-->
- [ ] Does this PR have a descriptive title that could go in our release notes?
- [ ] Does this PR add any new dependencies?
- [ ] Does this PR modify any existing APIs?
- [ ] Is the change to the API backwards compatible?
- [ ] Should this result in any changes to our documentation, either updating existing docs or adding new ones?

#### Reference Issues/PRs
<!--
Example: Fixes #1234. See also #3456.
Please use keywords (e.g., Fixes) to create link to the issues or pull requests
you resolved, so that they will automatically be closed when your pull request
is merged. See https://github.com/blog/1506-closing-issues-via-pull-requests
-->

#### What does this implement/fix? Briefly explain your changes.

#### Any other comments?

39
packages/leann-backend-diskann/third_party/DiskANN/.github/actions/build/action.yml
vendored
Normal file
@@ -0,0 +1,39 @@
name: 'DiskANN Build Bootstrap'
description: 'Prepares DiskANN build environment and executes build'
runs:
  using: "composite"
  steps:
    # ------------ Linux Build ---------------
    - name: Prepare and Execute Build
      if: ${{ runner.os == 'Linux' }}
      run: |
        sudo scripts/dev/install-dev-deps-ubuntu.bash
        cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DUNIT_TEST=True
        cmake --build build -- -j
        cmake --install build --prefix="dist"
      shell: bash
    # ------------ End Linux Build ---------------
    # ------------ Windows Build ---------------
    - name: Add VisualStudio command line tools into path
      if: runner.os == 'Windows'
      uses: ilammy/msvc-dev-cmd@v1
    - name: Run configure and build for Windows
      if: runner.os == 'Windows'
      run: |
        mkdir build && cd build && cmake .. -DUNIT_TEST=True && msbuild diskann.sln /m /nologo /t:Build /p:Configuration="Release" /property:Platform="x64" -consoleloggerparameters:"ErrorsOnly;Summary"
        cd ..
        mkdir dist
        mklink /j .\dist\bin .\x64\Release\
      shell: cmd
    # ------------ End Windows Build ---------------
    # ------------ Windows Build With EXEC_ENV_OLS and USE_BING_INFRA ---------------
    - name: Add VisualStudio command line tools into path
      if: runner.os == 'Windows'
      uses: ilammy/msvc-dev-cmd@v1
    - name: Run configure and build for Windows with Bing feature flags
      if: runner.os == 'Windows'
      run: |
        mkdir build_bing && cd build_bing && cmake .. -DEXEC_ENV_OLS=1 -DUSE_BING_INFRA=1 -DUNIT_TEST=True && msbuild diskann.sln /m /nologo /t:Build /p:Configuration="Release" /property:Platform="x64" -consoleloggerparameters:"ErrorsOnly;Summary"
        cd ..
      shell: cmd
    # ------------ End Windows Build ---------------
13
packages/leann-backend-diskann/third_party/DiskANN/.github/actions/format-check/action.yml
vendored
Normal file
@@ -0,0 +1,13 @@
name: 'Checking code formatting...'
description: 'Ensures code complies with code formatting rules'
runs:
  using: "composite"
  steps:
    - name: Checking code formatting...
      run: |
        sudo apt install clang-format
        find include -name '*.h' -type f -print0 | xargs -0 -P 16 /usr/bin/clang-format --Werror --dry-run
        find src -name '*.cpp' -type f -print0 | xargs -0 -P 16 /usr/bin/clang-format --Werror --dry-run
        find apps -name '*.cpp' -type f -print0 | xargs -0 -P 16 /usr/bin/clang-format --Werror --dry-run
        find python -name '*.cpp' -type f -print0 | xargs -0 -P 16 /usr/bin/clang-format --Werror --dry-run
      shell: bash
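Review note: the format-check action walks four source trees and hands every match to `clang-format --Werror --dry-run`. The Python sketch below mirrors that file selection so it can be tried locally. The function names `files_to_check` and `clang_format_command` are hypothetical helpers written for this note, not part of DiskANN, and the sketch only builds the command line rather than running it.

```python
from pathlib import Path
from typing import Iterable, List, Sequence, Tuple

# Directory/suffix pairs copied from the `find` commands in the action.
CHECKED_TREES: Sequence[Tuple[str, str]] = (
    ("include", ".h"),
    ("src", ".cpp"),
    ("apps", ".cpp"),
    ("python", ".cpp"),
)


def files_to_check(repo_root, trees: Sequence[Tuple[str, str]] = CHECKED_TREES) -> List[Path]:
    """Return every regular file under the listed trees with the matching suffix."""
    root = Path(repo_root)
    found: List[Path] = []
    for directory, suffix in trees:
        base = root / directory
        if not base.is_dir():
            # A missing tree is skipped here; `find` would report an error instead.
            continue
        found.extend(p for p in base.rglob(f"*{suffix}") if p.is_file())
    return sorted(found)


def clang_format_command(files: Iterable) -> List[str]:
    """Build the check-only clang-format invocation used by the action."""
    return ["clang-format", "--Werror", "--dry-run", *[str(f) for f in files]]
```

Passing the returned list to `subprocess.run(..., check=True)` would reproduce the pass/fail behaviour of the CI step, assuming `clang-format` is on the `PATH`.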
@@ -0,0 +1,28 @@
name: 'Generating Random Data (Basic)'
description: 'Generates the random data files used in acceptance tests'
runs:
  using: "composite"
  steps:
    - name: Generate Random Data (Basic)
      run: |
        mkdir data

        echo "Generating random 1020,1024,1536D float and 4096D int8 vectors for index"
        dist/bin/rand_data_gen --data_type float --output_file data/rand_float_1020D_5K_norm1.0.bin -D 1020 -N 5000 --norm 1.0
        #dist/bin/rand_data_gen --data_type float --output_file data/rand_float_1024D_5K_norm1.0.bin -D 1024 -N 5000 --norm 1.0
        dist/bin/rand_data_gen --data_type float --output_file data/rand_float_1536D_5K_norm1.0.bin -D 1536 -N 5000 --norm 1.0
        dist/bin/rand_data_gen --data_type int8 --output_file data/rand_int8_4096D_5K_norm1.0.bin -D 4096 -N 5000 --norm 1.0

        echo "Generating random 1020,1024,1536D float and 4096D int8 vectors for query"
        dist/bin/rand_data_gen --data_type float --output_file data/rand_float_1020D_1K_norm1.0.bin -D 1020 -N 1000 --norm 1.0
        #dist/bin/rand_data_gen --data_type float --output_file data/rand_float_1024D_1K_norm1.0.bin -D 1024 -N 1000 --norm 1.0
        dist/bin/rand_data_gen --data_type float --output_file data/rand_float_1536D_1K_norm1.0.bin -D 1536 -N 1000 --norm 1.0
        dist/bin/rand_data_gen --data_type int8 --output_file data/rand_int8_4096D_1K_norm1.0.bin -D 4096 -N 1000 --norm 1.0

        echo "Computing ground truth for 1020,1024,1536D float and 4096D int8 vectors for query"
        dist/bin/compute_groundtruth --data_type float --dist_fn l2 --base_file data/rand_float_1020D_5K_norm1.0.bin --query_file data/rand_float_1020D_1K_norm1.0.bin --gt_file data/l2_rand_float_1020D_5K_norm1.0_1020D_1K_norm1.0_gt100 --K 100
        #dist/bin/compute_groundtruth --data_type float --dist_fn l2 --base_file data/rand_float_1024D_5K_norm1.0.bin --query_file data/rand_float_1024D_1K_norm1.0.bin --gt_file data/l2_rand_float_1024D_5K_norm1.0_1024D_1K_norm1.0_gt100 --K 100
        dist/bin/compute_groundtruth --data_type float --dist_fn l2 --base_file data/rand_float_1536D_5K_norm1.0.bin --query_file data/rand_float_1536D_1K_norm1.0.bin --gt_file data/l2_rand_float_1536D_5K_norm1.0_1536D_1K_norm1.0_gt100 --K 100
        dist/bin/compute_groundtruth --data_type int8 --dist_fn l2 --base_file data/rand_int8_4096D_5K_norm1.0.bin --query_file data/rand_int8_4096D_1K_norm1.0.bin --gt_file data/l2_rand_int8_4096D_5K_norm1.0_4096D_1K_norm1.0_gt100 --K 100

      shell: bash
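Review note: a quick way to sanity-check the generated `.bin` files is to compare their size with what the `-D` and `-N` arguments imply. The sketch below assumes the binary layout described in DiskANN's documentation: a 4-byte integer point count, a 4-byte integer dimension, then the raw vectors. That layout, the little-endian byte order, and the helper names `expected_size` and `read_header` are assumptions made for this note and are not shown anywhere in this commit.

```python
import struct

# Assumed header: two little-endian 32-bit integers (number of points, dimension).
HEADER = struct.Struct("<ii")

# Bytes per component for the data types used by the action above.
ITEM_SIZE = {"float": 4, "int8": 1, "uint8": 1}


def expected_size(num_points: int, dims: int, data_type: str) -> int:
    """File size in bytes implied by -N, -D and --data_type under the assumed layout."""
    return HEADER.size + num_points * dims * ITEM_SIZE[data_type]


def read_header(path: str):
    """Return (num_points, dims) read from the first 8 bytes of a .bin file."""
    with open(path, "rb") as fh:
        return HEADER.unpack(fh.read(HEADER.size))
```

Under these assumptions, `data/rand_float_1020D_5K_norm1.0.bin` would be `expected_size(5000, 1020, "float")` bytes long, and `read_header` on it would return `(5000, 1020)`.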
38
packages/leann-backend-diskann/third_party/DiskANN/.github/actions/generate-random/action.yml
vendored
Normal file
@@ -0,0 +1,38 @@
name: 'Generating Random Data (Basic)'
description: 'Generates the random data files used in acceptance tests'
runs:
  using: "composite"
  steps:
    - name: Generate Random Data (Basic)
      run: |
        mkdir data

        echo "Generating random vectors for index"
        dist/bin/rand_data_gen --data_type float --output_file data/rand_float_10D_10K_norm1.0.bin -D 10 -N 10000 --norm 1.0
        dist/bin/rand_data_gen --data_type float --output_file data/rand_float_10D_10K_unnorm.bin -D 10 -N 10000 --rand_scaling 2.0
        dist/bin/rand_data_gen --data_type int8 --output_file data/rand_int8_10D_10K_norm50.0.bin -D 10 -N 10000 --norm 50.0
        dist/bin/rand_data_gen --data_type uint8 --output_file data/rand_uint8_10D_10K_norm50.0.bin -D 10 -N 10000 --norm 50.0

        echo "Generating random vectors for query"
        dist/bin/rand_data_gen --data_type float --output_file data/rand_float_10D_1K_norm1.0.bin -D 10 -N 1000 --norm 1.0
        dist/bin/rand_data_gen --data_type float --output_file data/rand_float_10D_1K_unnorm.bin -D 10 -N 1000 --rand_scaling 2.0
        dist/bin/rand_data_gen --data_type int8 --output_file data/rand_int8_10D_1K_norm50.0.bin -D 10 -N 1000 --norm 50.0
        dist/bin/rand_data_gen --data_type uint8 --output_file data/rand_uint8_10D_1K_norm50.0.bin -D 10 -N 1000 --norm 50.0

        echo "Computing ground truth for floats across l2, mips, and cosine distance functions"
        dist/bin/compute_groundtruth --data_type float --dist_fn l2 --base_file data/rand_float_10D_10K_norm1.0.bin --query_file data/rand_float_10D_1K_norm1.0.bin --gt_file data/l2_rand_float_10D_10K_norm1.0_10D_1K_norm1.0_gt100 --K 100
        dist/bin/compute_groundtruth --data_type float --dist_fn mips --base_file data/rand_float_10D_10K_norm1.0.bin --query_file data/rand_float_10D_1K_norm1.0.bin --gt_file data/mips_rand_float_10D_10K_norm1.0_10D_1K_norm1.0_gt100 --K 100
        dist/bin/compute_groundtruth --data_type float --dist_fn cosine --base_file data/rand_float_10D_10K_norm1.0.bin --query_file data/rand_float_10D_1K_norm1.0.bin --gt_file data/cosine_rand_float_10D_10K_norm1.0_10D_1K_norm1.0_gt100 --K 100
        dist/bin/compute_groundtruth --data_type float --dist_fn cosine --base_file data/rand_float_10D_10K_unnorm.bin --query_file data/rand_float_10D_1K_unnorm.bin --gt_file data/cosine_rand_float_10D_10K_unnorm_10D_1K_unnorm_gt100 --K 100

        echo "Computing ground truth for int8s across l2, mips, and cosine distance functions"
        dist/bin/compute_groundtruth --data_type int8 --dist_fn l2 --base_file data/rand_int8_10D_10K_norm50.0.bin --query_file data/rand_int8_10D_1K_norm50.0.bin --gt_file data/l2_rand_int8_10D_10K_norm50.0_10D_1K_norm50.0_gt100 --K 100
        dist/bin/compute_groundtruth --data_type int8 --dist_fn mips --base_file data/rand_int8_10D_10K_norm50.0.bin --query_file data/rand_int8_10D_1K_norm50.0.bin --gt_file data/mips_rand_int8_10D_10K_norm50.0_10D_1K_norm50.0_gt100 --K 100
        dist/bin/compute_groundtruth --data_type int8 --dist_fn cosine --base_file data/rand_int8_10D_10K_norm50.0.bin --query_file data/rand_int8_10D_1K_norm50.0.bin --gt_file data/cosine_rand_int8_10D_10K_norm50.0_10D_1K_norm50.0_gt100 --K 100

        echo "Computing ground truth for uint8s across l2, mips, and cosine distance functions"
        dist/bin/compute_groundtruth --data_type uint8 --dist_fn l2 --base_file data/rand_uint8_10D_10K_norm50.0.bin --query_file data/rand_uint8_10D_1K_norm50.0.bin --gt_file data/l2_rand_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100 --K 100
        dist/bin/compute_groundtruth --data_type uint8 --dist_fn mips --base_file data/rand_uint8_10D_10K_norm50.0.bin --query_file data/rand_uint8_10D_1K_norm50.0.bin --gt_file data/mips_rand_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100 --K 100
        dist/bin/compute_groundtruth --data_type uint8 --dist_fn cosine --base_file data/rand_uint8_10D_10K_norm50.0.bin --query_file data/rand_uint8_10D_1K_norm50.0.bin --gt_file data/cosine_rand_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100 --K 100

      shell: bash
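Review note: the `_gt100` files produced above hold, for each query, its exact nearest neighbours in the base set, and the acceptance tests in this commit compare search results against them (`--recall_at`, `--fail_if_recall_below`). The sketch below illustrates the idea with a brute-force L2 search and a recall@k percentage. It is an illustration of the general technique only: the function names are hypothetical, the tie-breaking rule is an assumption, and nothing here is taken from the `compute_groundtruth` or `search_disk_index` sources.

```python
from typing import List, Sequence

Vector = Sequence[float]


def l2_squared(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance; the ordering matches true L2 distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_knn(base: Sequence[Vector], queries: Sequence[Vector], k: int) -> List[List[int]]:
    """For each query, return the ids of its k nearest base vectors, closest first."""
    results = []
    for q in queries:
        # Ties are broken by the smaller id (an assumption made for determinism).
        order = sorted(range(len(base)), key=lambda i: (l2_squared(base[i], q), i))
        results.append(order[:k])
    return results


def recall_at_k(found: Sequence[Sequence[int]], truth: Sequence[Sequence[int]], k: int) -> float:
    """Percentage of true top-k neighbours present in the returned top-k, averaged over queries."""
    hits = sum(len(set(f[:k]) & set(t[:k])) for f, t in zip(found, truth))
    return 100.0 * hits / (k * len(truth))
```

With this definition, a test that passes `--recall_at 5 --fail_if_recall_below 70` would require, on average, at least 3.5 of the 5 returned ids per query to be true nearest neighbours.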
22
packages/leann-backend-diskann/third_party/DiskANN/.github/actions/python-wheel/action.yml
vendored
Normal file
@@ -0,0 +1,22 @@
name: Build Python Wheel
description: Builds a python wheel with cibuildwheel
inputs:
  cibw-identifier:
    description: "CI build wheel identifier to build"
    required: true
runs:
  using: "composite"
  steps:
    - uses: actions/setup-python@v3
    - name: Install cibuildwheel
      run: python -m pip install cibuildwheel==2.11.3
      shell: bash
    - name: Building Python ${{inputs.cibw-identifier}} Wheel
      run: python -m cibuildwheel --output-dir dist
      env:
        CIBW_BUILD: ${{inputs.cibw-identifier}}
      shell: bash
    - uses: actions/upload-artifact@v3
      with:
        name: wheels
        path: ./dist/*.whl
81
packages/leann-backend-diskann/third_party/DiskANN/.github/workflows/build-python-pdoc.yml
vendored
Normal file
@@ -0,0 +1,81 @@
name: DiskANN Build PDoc Documentation
on: [workflow_call]
jobs:
  build-reference-documentation:
    permissions:
      contents: write
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v3
        with:
          fetch-depth: 1
      - name: Set up Python 3.9
        uses: actions/setup-python@v2
        with:
          python-version: 3.9
      - name: Install python build
        run: python -m pip install build
        shell: bash
      # Install required dependencies
      - name: Prepare Linux environment
        run: |
          sudo scripts/dev/install-dev-deps-ubuntu.bash
        shell: bash
      # We need to build the wheel in order to run pdoc. pdoc does not seem to work if you just point it at
      # our source directory.
      - name: Building Python Wheel for documentation generation
        run: python -m build --wheel --outdir documentation_dist
        shell: bash
      - name: "Run Reference Documentation Generation"
        run: |
          pip install pdoc pipdeptree
          pip install documentation_dist/*.whl
          echo "documentation" > dependencies_documentation.txt
          pipdeptree >> dependencies_documentation.txt
          pdoc -o docs/python/html diskannpy
      - name: Create version environment variable
        run: |
          echo "DISKANN_VERSION=$(python <<EOF
          from importlib.metadata import version
          v = version('diskannpy')
          print(v)
          EOF
          )" >> $GITHUB_ENV
      - name: Archive documentation version artifact
        uses: actions/upload-artifact@v4
        with:
          name: dependencies
          path: |
            ${{ github.run_id }}-dependencies_documentation.txt
          overwrite: true
      - name: Archive documentation artifacts
        uses: actions/upload-artifact@v4
        with:
          name: documentation-site
          path: |
            docs/python/html
      # Publish to /dev if we are on the "main" branch
      - name: Publish reference docs for latest development version (main branch)
        uses: peaceiris/actions-gh-pages@v3
        if: github.ref == 'refs/heads/main'
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: docs/python/html
          destination_dir: docs/python/dev
      # Publish to /<version> if we are releasing
      - name: Publish reference docs by version (main branch)
        uses: peaceiris/actions-gh-pages@v3
        if: github.event_name == 'release'
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: docs/python/html
          destination_dir: docs/python/${{ env.DISKANN_VERSION }}
      # Publish to /latest if we are releasing
      - name: Publish latest reference docs (main branch)
        uses: peaceiris/actions-gh-pages@v3
        if: github.event_name == 'release'
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: docs/python/html
          destination_dir: docs/python/latest
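Review note: the "Create version environment variable" step looks up the installed `diskannpy` version with `importlib.metadata` and appends a `NAME=value` line to the file named by `$GITHUB_ENV`, which makes the value available to later steps as `env.DISKANN_VERSION`. The sketch below shows the same two operations in plain Python. The helper names `package_version` and `export_env` and the `"unknown"` fallback are hypothetical choices made for this note; the workflow itself has no fallback.

```python
from importlib.metadata import PackageNotFoundError, version


def package_version(distribution: str, default: str = "unknown") -> str:
    """Return the installed version of a distribution, or `default` if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return default


def export_env(name: str, value: str, env_file: str) -> None:
    """Append one NAME=value line to a GitHub Actions environment file."""
    if "\n" in value:
        # Multi-line values need the delimiter syntax, which this sketch does not cover.
        raise ValueError("multi-line values are not supported by this helper")
    with open(env_file, "a", encoding="utf-8") as fh:
        fh.write(f"{name}={value}\n")
```

Inside a workflow step this could be called as `export_env("DISKANN_VERSION", package_version("diskannpy"), os.environ["GITHUB_ENV"])` after `import os`.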
42
packages/leann-backend-diskann/third_party/DiskANN/.github/workflows/build-python.yml
vendored
Normal file
@@ -0,0 +1,42 @@
name: DiskANN Build Python Wheel
on: [workflow_call]
jobs:
  linux-build:
    name: Python - Ubuntu - ${{matrix.cibw-identifier}}
    strategy:
      fail-fast: false
      matrix:
        cibw-identifier: ["cp39-manylinux_x86_64", "cp310-manylinux_x86_64", "cp311-manylinux_x86_64"]
    runs-on: ubuntu-latest
    defaults:
      run:
        shell: bash
    steps:
      - name: Checkout repository
        uses: actions/checkout@v3
        with:
          fetch-depth: 1
      - name: Building python wheel ${{matrix.cibw-identifier}}
        uses: ./.github/actions/python-wheel
        with:
          cibw-identifier: ${{matrix.cibw-identifier}}
  windows-build:
    name: Python - Windows - ${{matrix.cibw-identifier}}
    strategy:
      fail-fast: false
      matrix:
        cibw-identifier: ["cp39-win_amd64", "cp310-win_amd64", "cp311-win_amd64"]
    runs-on: windows-latest
    defaults:
      run:
        shell: bash
    steps:
      - name: Checkout repository
        uses: actions/checkout@v3
        with:
          submodules: true
          fetch-depth: 1
      - name: Building python wheel ${{matrix.cibw-identifier}}
        uses: ./.github/actions/python-wheel
        with:
          cibw-identifier: ${{matrix.cibw-identifier}}
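Review note: each matrix entry in this workflow is a cibuildwheel build identifier of the form `cp<major><minor>-<platform tag>`, passed through to `CIBW_BUILD` by the python-wheel action. A small sketch of how the two lists could be derived from a set of Python versions follows. The function name `cibw_identifiers` is hypothetical and only CPython identifiers are covered.

```python
from typing import Iterable, List


def cibw_identifiers(python_versions: Iterable[str], platform_tag: str) -> List[str]:
    """Build CPython cibuildwheel identifiers such as 'cp39-manylinux_x86_64'."""
    identifiers = []
    for ver in python_versions:
        # Keep versions as strings: a float would turn "3.10" into 3.1.
        major, minor = ver.split(".")
        identifiers.append(f"cp{major}{minor}-{platform_tag}")
    return identifiers


VERSIONS = ["3.9", "3.10", "3.11"]
LINUX = cibw_identifiers(VERSIONS, "manylinux_x86_64")
WINDOWS = cibw_identifiers(VERSIONS, "win_amd64")
```

Generating the lists this way keeps the Linux and Windows jobs in step when a Python version is added or dropped.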
28
packages/leann-backend-diskann/third_party/DiskANN/.github/workflows/common.yml
vendored
Normal file
@@ -0,0 +1,28 @@
name: DiskANN Common Checks
# common means common to both pr-test and push-test
on: [workflow_call]
jobs:
  formatting-check:
    strategy:
      fail-fast: true
    name: Code Formatting Test
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v3
        with:
          fetch-depth: 1
      - name: Checking code formatting...
        uses: ./.github/actions/format-check
  docker-container-build:
    name: Docker Container Build
    needs: [formatting-check]
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v3
        with:
          fetch-depth: 1
      - name: Docker build
        run: |
          docker build .
117
packages/leann-backend-diskann/third_party/DiskANN/.github/workflows/disk-pq.yml
vendored
Normal file
@@ -0,0 +1,117 @@
name: Disk With PQ
on: [workflow_call]
jobs:
  acceptance-tests-disk-pq:
    name: Disk, PQ
    strategy:
      fail-fast: false
      matrix:
        os: [ubuntu-latest, windows-2019, windows-latest]
    runs-on: ${{matrix.os}}
    defaults:
      run:
        shell: bash
    steps:
      - name: Checkout repository
        if: ${{ runner.os == 'Linux' }}
        uses: actions/checkout@v3
        with:
          fetch-depth: 1
      - name: Checkout repository
        if: ${{ runner.os == 'Windows' }}
        uses: actions/checkout@v3
        with:
          fetch-depth: 1
          submodules: true
      - name: DiskANN Build CLI Applications
        uses: ./.github/actions/build

      - name: Generate Data
        uses: ./.github/actions/generate-random

      - name: build and search disk index (one shot graph build, L2, no diskPQ) (float)
        if: success() || failure()
        run: |
          dist/bin/build_disk_index --data_type float --dist_fn l2 --data_path data/rand_float_10D_10K_norm1.0.bin --index_path_prefix data/disk_index_l2_rand_float_10D_10K_norm1.0_diskfull_oneshot -R 16 -L 32 -B 0.00003 -M 1
          dist/bin/search_disk_index --data_type float --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/disk_index_l2_rand_float_10D_10K_norm1.0_diskfull_oneshot --result_path /tmp/res --query_file data/rand_float_10D_1K_norm1.0.bin --gt_file data/l2_rand_float_10D_10K_norm1.0_10D_1K_norm1.0_gt100 --recall_at 5 -L 5 12 -W 2 --num_nodes_to_cache 10 -T 16
      - name: build and search disk index (one shot graph build, cosine, no diskPQ) (float)
        if: success() || failure()
        run: |
          dist/bin/build_disk_index --data_type float --dist_fn cosine --data_path data/rand_float_10D_10K_unnorm.bin --index_path_prefix data/disk_index_cosine_rand_float_10D_10K_unnorm_diskfull_oneshot -R 16 -L 32 -B 0.00003 -M 1
          dist/bin/search_disk_index --data_type float --dist_fn cosine --fail_if_recall_below 70 --index_path_prefix data/disk_index_cosine_rand_float_10D_10K_unnorm_diskfull_oneshot --result_path /tmp/res --query_file data/rand_float_10D_1K_unnorm.bin --gt_file data/cosine_rand_float_10D_10K_unnorm_10D_1K_unnorm_gt100 --recall_at 5 -L 5 12 -W 2 --num_nodes_to_cache 10 -T 16
      - name: build and search disk index (one shot graph build, L2, no diskPQ) (int8)
        if: success() || failure()
        run: |
          dist/bin/build_disk_index --data_type int8 --dist_fn l2 --data_path data/rand_int8_10D_10K_norm50.0.bin --index_path_prefix data/disk_index_l2_rand_int8_10D_10K_norm50.0_diskfull_oneshot -R 16 -L 32 -B 0.00003 -M 1
          dist/bin/search_disk_index --data_type int8 --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/disk_index_l2_rand_int8_10D_10K_norm50.0_diskfull_oneshot --result_path /tmp/res --query_file data/rand_int8_10D_1K_norm50.0.bin --gt_file data/l2_rand_int8_10D_10K_norm50.0_10D_1K_norm50.0_gt100 --recall_at 5 -L 5 12 -W 2 --num_nodes_to_cache 10 -T 16
      - name: build and search disk index (one shot graph build, L2, no diskPQ) (uint8)
        if: success() || failure()
        run: |
          dist/bin/build_disk_index --data_type uint8 --dist_fn l2 --data_path data/rand_uint8_10D_10K_norm50.0.bin --index_path_prefix data/disk_index_l2_rand_uint8_10D_10K_norm50.0_diskfull_oneshot -R 16 -L 32 -B 0.00003 -M 1
          dist/bin/search_disk_index --data_type uint8 --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/disk_index_l2_rand_uint8_10D_10K_norm50.0_diskfull_oneshot --result_path /tmp/res --query_file data/rand_uint8_10D_1K_norm50.0.bin --gt_file data/l2_rand_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100 --recall_at 5 -L 5 12 -W 2 --num_nodes_to_cache 10 -T 16

      - name: build and search disk index (one shot graph build, L2, no diskPQ, build with PQ distance comparisons) (float)
        if: success() || failure()
        run: |
          dist/bin/build_disk_index --data_type float --dist_fn l2 --data_path data/rand_float_10D_10K_norm1.0.bin --index_path_prefix data/disk_index_l2_rand_float_10D_10K_norm1.0_diskfull_oneshot_buildpq5 -R 16 -L 32 -B 0.00003 -M 1 --build_PQ_bytes 5
          dist/bin/search_disk_index --data_type float --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/disk_index_l2_rand_float_10D_10K_norm1.0_diskfull_oneshot_buildpq5 --result_path /tmp/res --query_file data/rand_float_10D_1K_norm1.0.bin --gt_file data/l2_rand_float_10D_10K_norm1.0_10D_1K_norm1.0_gt100 --recall_at 5 -L 5 12 -W 2 --num_nodes_to_cache 10 -T 16
      - name: build and search disk index (one shot graph build, L2, no diskPQ, build with PQ distance comparisons) (int8)
        if: success() || failure()
        run: |
          dist/bin/build_disk_index --data_type int8 --dist_fn l2 --data_path data/rand_int8_10D_10K_norm50.0.bin --index_path_prefix data/disk_index_l2_rand_int8_10D_10K_norm50.0_diskfull_oneshot_buildpq5 -R 16 -L 32 -B 0.00003 -M 1 --build_PQ_bytes 5
          dist/bin/search_disk_index --data_type int8 --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/disk_index_l2_rand_int8_10D_10K_norm50.0_diskfull_oneshot_buildpq5 --result_path /tmp/res --query_file data/rand_int8_10D_1K_norm50.0.bin --gt_file data/l2_rand_int8_10D_10K_norm50.0_10D_1K_norm50.0_gt100 --recall_at 5 -L 5 12 -W 2 --num_nodes_to_cache 10 -T 16
      - name: build and search disk index (one shot graph build, L2, no diskPQ, build with PQ distance comparisons) (uint8)
        if: success() || failure()
        run: |
          dist/bin/build_disk_index --data_type uint8 --dist_fn l2 --data_path data/rand_uint8_10D_10K_norm50.0.bin --index_path_prefix data/disk_index_l2_rand_uint8_10D_10K_norm50.0_diskfull_oneshot_buildpq5 -R 16 -L 32 -B 0.00003 -M 1 --build_PQ_bytes 5
          dist/bin/search_disk_index --data_type uint8 --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/disk_index_l2_rand_uint8_10D_10K_norm50.0_diskfull_oneshot_buildpq5 --result_path /tmp/res --query_file data/rand_uint8_10D_1K_norm50.0.bin --gt_file data/l2_rand_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100 --recall_at 5 -L 5 12 -W 2 --num_nodes_to_cache 10 -T 16

      - name: build and search disk index (sharded graph build, L2, no diskPQ) (float)
        if: success() || failure()
        run: |
          dist/bin/build_disk_index --data_type float --dist_fn l2 --data_path data/rand_float_10D_10K_norm1.0.bin --index_path_prefix data/disk_index_l2_rand_float_10D_10K_norm1.0_diskfull_sharded -R 16 -L 32 -B 0.00003 -M 0.00006
          dist/bin/search_disk_index --data_type float --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/disk_index_l2_rand_float_10D_10K_norm1.0_diskfull_sharded --result_path /tmp/res --query_file data/rand_float_10D_1K_norm1.0.bin --gt_file data/l2_rand_float_10D_10K_norm1.0_10D_1K_norm1.0_gt100 --recall_at 5 -L 5 12 -W 2 --num_nodes_to_cache 10 -T 16
      - name: build and search disk index (sharded graph build, cosine, no diskPQ) (float)
        if: success() || failure()
        run: |
          dist/bin/build_disk_index --data_type float --dist_fn cosine --data_path data/rand_float_10D_10K_unnorm.bin --index_path_prefix data/disk_index_cosine_rand_float_10D_10K_unnorm_diskfull_sharded -R 16 -L 32 -B 0.00003 -M 0.00006
          dist/bin/search_disk_index --data_type float --dist_fn cosine --fail_if_recall_below 70 --index_path_prefix data/disk_index_cosine_rand_float_10D_10K_unnorm_diskfull_sharded --result_path /tmp/res --query_file data/rand_float_10D_1K_unnorm.bin --gt_file data/cosine_rand_float_10D_10K_unnorm_10D_1K_unnorm_gt100 --recall_at 5 -L 5 12 -W 2 --num_nodes_to_cache 10 -T 16
      - name: build and search disk index (sharded graph build, L2, no diskPQ) (int8)
        run: |
          dist/bin/build_disk_index --data_type int8 --dist_fn l2 --data_path data/rand_int8_10D_10K_norm50.0.bin --index_path_prefix data/disk_index_l2_rand_int8_10D_10K_norm50.0_diskfull_sharded -R 16 -L 32 -B 0.00003 -M 0.00006
          dist/bin/search_disk_index --data_type int8 --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/disk_index_l2_rand_int8_10D_10K_norm50.0_diskfull_sharded --result_path /tmp/res --query_file data/rand_int8_10D_1K_norm50.0.bin --gt_file data/l2_rand_int8_10D_10K_norm50.0_10D_1K_norm50.0_gt100 --recall_at 5 -L 5 12 -W 2 --num_nodes_to_cache 10 -T 16
      - name: build and search disk index (sharded graph build, L2, no diskPQ) (uint8)
        if: success() || failure()
        run: |
          dist/bin/build_disk_index --data_type uint8 --dist_fn l2 --data_path data/rand_uint8_10D_10K_norm50.0.bin --index_path_prefix data/disk_index_l2_rand_uint8_10D_10K_norm50.0_diskfull_sharded -R 16 -L 32 -B 0.00003 -M 0.00006
          dist/bin/search_disk_index --data_type uint8 --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/disk_index_l2_rand_uint8_10D_10K_norm50.0_diskfull_sharded --result_path /tmp/res --query_file data/rand_uint8_10D_1K_norm50.0.bin --gt_file data/l2_rand_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100 --recall_at 5 -L 5 12 -W 2 --num_nodes_to_cache 10 -T 16

- name: build and search disk index (one shot graph build, L2, diskPQ) (float)
|
||||
if: success() || failure()
|
||||
run: |
|
||||
dist/bin/build_disk_index --data_type float --dist_fn l2 --data_path data/rand_float_10D_10K_norm1.0.bin --index_path_prefix data/disk_index_l2_rand_float_10D_10K_norm1.0_diskpq_oneshot -R 16 -L 32 -B 0.00003 -M 1 --PQ_disk_bytes 5
|
||||
dist/bin/search_disk_index --data_type float --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/disk_index_l2_rand_float_10D_10K_norm1.0_diskpq_oneshot --result_path /tmp/res --query_file data/rand_float_10D_1K_norm1.0.bin --gt_file data/l2_rand_float_10D_10K_norm1.0_10D_1K_norm1.0_gt100 --recall_at 5 -L 5 12 -W 2 --num_nodes_to_cache 10 -T 16
|
||||
- name: build and search disk index (one shot graph build, L2, diskPQ) (int8)
|
||||
if: success() || failure()
|
||||
run: |
|
||||
dist/bin/build_disk_index --data_type int8 --dist_fn l2 --data_path data/rand_int8_10D_10K_norm50.0.bin --index_path_prefix data/disk_index_l2_rand_int8_10D_10K_norm50.0_diskpq_oneshot -R 16 -L 32 -B 0.00003 -M 1 --PQ_disk_bytes 5
dist/bin/search_disk_index --data_type int8 --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/disk_index_l2_rand_int8_10D_10K_norm50.0_diskpq_oneshot --result_path /tmp/res --query_file data/rand_int8_10D_1K_norm50.0.bin --gt_file data/l2_rand_int8_10D_10K_norm50.0_10D_1K_norm50.0_gt100 --recall_at 5 -L 5 12 -W 2 --num_nodes_to_cache 10 -T 16
- name: build and search disk index (one shot graph build, L2, diskPQ) (uint8)
if: success() || failure()
run: |
dist/bin/build_disk_index --data_type uint8 --dist_fn l2 --data_path data/rand_uint8_10D_10K_norm50.0.bin --index_path_prefix data/disk_index_l2_rand_uint8_10D_10K_norm50.0_diskpq_oneshot -R 16 -L 32 -B 0.00003 -M 1 --PQ_disk_bytes 5
dist/bin/search_disk_index --data_type uint8 --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/disk_index_l2_rand_uint8_10D_10K_norm50.0_diskpq_oneshot --result_path /tmp/res --query_file data/rand_uint8_10D_1K_norm50.0.bin --gt_file data/l2_rand_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100 --recall_at 5 -L 5 12 -W 2 --num_nodes_to_cache 10 -T 16
- name: build and search disk index (sharded graph build, MIPS, diskPQ) (float)
if: success() || failure()
run: |
dist/bin/build_disk_index --data_type float --dist_fn mips --data_path data/rand_float_10D_10K_norm1.0.bin --index_path_prefix data/disk_index_mips_rand_float_10D_10K_norm1.0_diskpq_sharded -R 16 -L 32 -B 0.00003 -M 0.00006 --PQ_disk_bytes 5
dist/bin/search_disk_index --data_type float --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/disk_index_mips_rand_float_10D_10K_norm1.0_diskpq_sharded --result_path /tmp/res --query_file data/rand_float_10D_1K_norm1.0.bin --gt_file data/mips_rand_float_10D_10K_norm1.0_10D_1K_norm1.0_gt100 --recall_at 5 -L 5 12 -W 2 --num_nodes_to_cache 10 -T 16
- name: upload data and bin
uses: actions/upload-artifact@v4
with:
name: disk-pq-${{matrix.os}}
path: |
./dist/**
./data/**
102
packages/leann-backend-diskann/third_party/DiskANN/.github/workflows/dynamic-labels.yml
vendored
Normal file
@@ -0,0 +1,102 @@
name: Dynamic-Labels
on: [workflow_call]
jobs:
acceptance-tests-dynamic:
name: Dynamic-Labels
strategy:
fail-fast: false
matrix:
os: [ubuntu-latest, windows-2019, windows-latest]
runs-on: ${{matrix.os}}
defaults:
run:
shell: bash
steps:
- name: Checkout repository
if: ${{ runner.os == 'Linux' }}
uses: actions/checkout@v3
with:
fetch-depth: 1
- name: Checkout repository
if: ${{ runner.os == 'Windows' }}
uses: actions/checkout@v3
with:
fetch-depth: 1
submodules: true
- name: DiskANN Build CLI Applications
uses: ./.github/actions/build

- name: Generate Data
uses: ./.github/actions/generate-random

- name: Generate Labels
run: |
echo "Generating synthetic labels and computing ground truth for filtered search with universal label"
dist/bin/generate_synthetic_labels --num_labels 50 --num_points 10000 --output_file data/rand_labels_50_10K.txt --distribution_type random
echo "Generating synthetic labels with a zipf distribution and computing ground truth for filtered search with universal label"
dist/bin/generate_synthetic_labels --num_labels 50 --num_points 10000 --output_file data/zipf_labels_50_10K.txt --distribution_type zipf
- name: Test a streaming index (float) with labels (Zipf distributed)
run: |
dist/bin/test_streaming_scenario --data_type float --dist_fn l2 --data_path data/rand_float_10D_10K_norm1.0.bin --universal_label 0 --label_file data/zipf_labels_50_10K.txt --index_path_prefix data/index_zipf_stream -R 64 --FilteredLbuild 200 -L 50 --alpha 1.2 --insert_threads 8 --consolidate_threads 8 --max_points_to_insert 10000 --active_window 4000 --consolidate_interval 2000 --start_point_norm 3.2 --unique_labels_supported 51
echo "Computing groundtruth with filter"
dist/bin/compute_groundtruth_for_filters --data_type float --universal_label 0 --filter_label 1 --dist_fn l2 --base_file data/index_zipf_stream.after-streaming-act4000-cons2000-max10000.data --query_file data/rand_float_10D_1K_norm1.0.bin --K 100 --gt_file data/gt100_zipf_base-act4000-cons2000-max10000_1 --label_file data/index_zipf_stream.after-streaming-act4000-cons2000-max10000_raw_labels.txt --tags_file data/index_zipf_stream.after-streaming-act4000-cons2000-max10000.tags
echo "Searching with filter"
dist/bin/search_memory_index --data_type float --dist_fn l2 --filter_label 1 --fail_if_recall_below 40 --index_path_prefix data/index_zipf_stream.after-streaming-act4000-cons2000-max10000 --result_path data/res_stream --query_file data/rand_float_10D_1K_norm1.0.bin --gt_file data/gt100_zipf_base-act4000-cons2000-max10000_1 -K 10 -L 20 40 60 80 100 150 -T 64 --dynamic true --tags 1
echo "Computing groundtruth w/o filter"
dist/bin/compute_groundtruth --data_type float --dist_fn l2 --base_file data/index_zipf_stream.after-streaming-act4000-cons2000-max10000.data --query_file data/rand_float_10D_1K_norm1.0.bin --K 100 --gt_file data/gt100_zipf_base-act4000-cons2000-max10000
echo "Searching without filter"
dist/bin/search_memory_index --data_type float --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/index_zipf_stream.after-streaming-act4000-cons2000-max10000 --result_path res_stream --query_file data/rand_float_10D_1K_norm1.0.bin --gt_file data/gt100_zipf_base-act4000-cons2000-max10000 -K 10 -L 20 40 60 80 100 -T 64
- name: Test a streaming index (float) with labels (random distributed)
run: |
dist/bin/test_streaming_scenario --data_type float --dist_fn l2 --data_path data/rand_float_10D_10K_norm1.0.bin --universal_label 0 --label_file data/rand_labels_50_10K.txt --index_path_prefix data/index_rand_stream -R 64 --FilteredLbuild 200 -L 50 --alpha 1.2 --insert_threads 8 --consolidate_threads 8 --max_points_to_insert 10000 --active_window 4000 --consolidate_interval 2000 --start_point_norm 3.2 --unique_labels_supported 51
echo "Computing groundtruth with filter"
dist/bin/compute_groundtruth_for_filters --data_type float --universal_label 0 --filter_label 1 --dist_fn l2 --base_file data/index_rand_stream.after-streaming-act4000-cons2000-max10000.data --query_file data/rand_float_10D_1K_norm1.0.bin --K 100 --gt_file data/gt100_rand_base-act4000-cons2000-max10000_1 --label_file data/index_rand_stream.after-streaming-act4000-cons2000-max10000_raw_labels.txt --tags_file data/index_rand_stream.after-streaming-act4000-cons2000-max10000.tags
echo "Searching with filter"
dist/bin/search_memory_index --data_type float --dist_fn l2 --filter_label 1 --fail_if_recall_below 40 --index_path_prefix data/index_rand_stream.after-streaming-act4000-cons2000-max10000 --result_path data/res_stream --query_file data/rand_float_10D_1K_norm1.0.bin --gt_file data/gt100_rand_base-act4000-cons2000-max10000_1 -K 10 -L 20 40 60 80 100 150 -T 64 --dynamic true --tags 1
echo "Computing groundtruth w/o filter"
dist/bin/compute_groundtruth --data_type float --dist_fn l2 --base_file data/index_rand_stream.after-streaming-act4000-cons2000-max10000.data --query_file data/rand_float_10D_1K_norm1.0.bin --K 100 --gt_file data/gt100_rand_base-act4000-cons2000-max10000
echo "Searching without filter"
dist/bin/search_memory_index --data_type float --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/index_rand_stream.after-streaming-act4000-cons2000-max10000 --result_path res_stream --query_file data/rand_float_10D_1K_norm1.0.bin --gt_file data/gt100_rand_base-act4000-cons2000-max10000 -K 10 -L 20 40 60 80 100 -T 64
- name: Test Insert Delete Consolidate (float) with labels (zipf distributed)
run: |
dist/bin/test_insert_deletes_consolidate --data_type float --dist_fn l2 --universal_label 0 --label_file data/zipf_labels_50_10K.txt --FilteredLbuild 70 --data_path data/rand_float_10D_10K_norm1.0.bin --index_path_prefix data/index_zipf_ins_del -R 64 -L 10 --alpha 1.2 --points_to_skip 0 --max_points_to_insert 7500 --beginning_index_size 0 --points_per_checkpoint 1000 --checkpoints_per_snapshot 0 --points_to_delete_from_beginning 2500 --start_deletes_after 5000 --do_concurrent true --start_point_norm 3.2 --unique_labels_supported 51
echo "Computing groundtruth with filter"
dist/bin/compute_groundtruth_for_filters --data_type float --filter_label 5 --universal_label 0 --dist_fn l2 --base_file data/index_zipf_ins_del.after-concurrent-delete-del2500-7500.data --query_file data/rand_float_10D_1K_norm1.0.bin --K 100 --gt_file data/gt100_zipf_random10D_1K_wlabel_5 --label_file data/index_zipf_ins_del.after-concurrent-delete-del2500-7500_raw_labels.txt --tags_file data/index_zipf_ins_del.after-concurrent-delete-del2500-7500.tags
echo "Searching with filter"
dist/bin/search_memory_index --data_type float --dist_fn l2 --filter_label 5 --fail_if_recall_below 10 --index_path_prefix data/index_zipf_ins_del.after-concurrent-delete-del2500-7500 --result_path data/res_zipf_stream --query_file data/rand_float_10D_1K_norm1.0.bin --gt_file data/gt100_zipf_random10D_1K_wlabel_5 -K 10 -L 20 40 60 80 100 150 -T 64 --dynamic true --tags 1
echo "Computing groundtruth w/o filter"
dist/bin/compute_groundtruth --data_type float --dist_fn l2 --base_file data/index_zipf_ins_del.after-concurrent-delete-del2500-7500.data --query_file data/rand_float_10D_1K_norm1.0.bin --K 100 --gt_file data/gt100_zipf_random10D_1K
echo "Searching without filter"
dist/bin/search_memory_index --data_type float --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/index_zipf_ins_del.after-concurrent-delete-del2500-7500 --result_path res_stream --query_file data/rand_float_10D_1K_norm1.0.bin --gt_file data/gt100_zipf_random10D_1K -K 10 -L 20 40 60 80 100 -T 64
- name: Test Insert Delete Consolidate (float) with labels (random distributed)
run: |
dist/bin/test_insert_deletes_consolidate --data_type float --dist_fn l2 --universal_label 0 --label_file data/rand_labels_50_10K.txt --FilteredLbuild 70 --data_path data/rand_float_10D_10K_norm1.0.bin --index_path_prefix data/index_rand_ins_del -R 64 -L 10 --alpha 1.2 --points_to_skip 0 --max_points_to_insert 7500 --beginning_index_size 0 --points_per_checkpoint 1000 --checkpoints_per_snapshot 0 --points_to_delete_from_beginning 2500 --start_deletes_after 5000 --do_concurrent true --start_point_norm 3.2 --unique_labels_supported 51
echo "Computing groundtruth with filter"
dist/bin/compute_groundtruth_for_filters --data_type float --filter_label 5 --universal_label 0 --dist_fn l2 --base_file data/index_rand_ins_del.after-concurrent-delete-del2500-7500.data --query_file data/rand_float_10D_1K_norm1.0.bin --K 100 --gt_file data/gt100_rand_random10D_1K_wlabel_5 --label_file data/index_rand_ins_del.after-concurrent-delete-del2500-7500_raw_labels.txt --tags_file data/index_rand_ins_del.after-concurrent-delete-del2500-7500.tags
echo "Searching with filter"
dist/bin/search_memory_index --data_type float --dist_fn l2 --filter_label 5 --fail_if_recall_below 40 --index_path_prefix data/index_rand_ins_del.after-concurrent-delete-del2500-7500 --result_path data/res_rand_stream --query_file data/rand_float_10D_1K_norm1.0.bin --gt_file data/gt100_rand_random10D_1K_wlabel_5 -K 10 -L 20 40 60 80 100 150 -T 64 --dynamic true --tags 1
echo "Computing groundtruth w/o filter"
dist/bin/compute_groundtruth --data_type float --dist_fn l2 --base_file data/index_rand_ins_del.after-concurrent-delete-del2500-7500.data --query_file data/rand_float_10D_1K_norm1.0.bin --K 100 --gt_file data/gt100_rand_random10D_1K
echo "Searching without filter"
dist/bin/search_memory_index --data_type float --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/index_rand_ins_del.after-concurrent-delete-del2500-7500 --result_path res_stream --query_file data/rand_float_10D_1K_norm1.0.bin --gt_file data/gt100_rand_random10D_1K -K 10 -L 20 40 60 80 100 -T 64
- name: upload data and bin
uses: actions/upload-artifact@v4
with:
name: dynamic-labels-${{matrix.os}}
path: |
./dist/**
./data/**
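The compute_groundtruth and search_memory_index commands in the workflow above read files whose names follow from the parameters passed to test_streaming_scenario and test_insert_deletes_consolidate. Below is a minimal shell sketch of that naming, inferred only from the paths that appear in these steps. The helper names streaming_prefix and ins_del_prefix are hypothetical and are not part of DiskANN.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not DiskANN tools): rebuild the output prefixes that the
# ground-truth and search commands consume. The suffix patterns are an assumption
# inferred from the file names used in this workflow.

streaming_prefix() {
  # $1 = index prefix, $2 = active window, $3 = consolidate interval, $4 = max points
  printf '%s.after-streaming-act%s-cons%s-max%s\n' "$1" "$2" "$3" "$4"
}

ins_del_prefix() {
  # $1 = index prefix, $2 = points deleted from the beginning, $3 = max points inserted
  printf '%s.after-concurrent-delete-del%s-%s\n' "$1" "$2" "$3"
}

prefix="$(streaming_prefix data/index_zipf_stream 4000 2000 10000)"
echo "${prefix}.data"            # base file for compute_groundtruth_for_filters
echo "${prefix}.tags"            # tags file
echo "${prefix}_raw_labels.txt"  # label file
ins_del_prefix data/index_zipf_ins_del 2500 7500
```

Deriving the prefix once would keep --active_window, --consolidate_interval and --max_points_to_insert in sync with the file names used later in the same step.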
75
packages/leann-backend-diskann/third_party/DiskANN/.github/workflows/dynamic.yml
vendored
Normal file
@@ -0,0 +1,75 @@
name: Dynamic
on: [workflow_call]
jobs:
acceptance-tests-dynamic:
name: Dynamic
strategy:
fail-fast: false
matrix:
os: [ubuntu-latest, windows-2019, windows-latest]
runs-on: ${{matrix.os}}
defaults:
run:
shell: bash
steps:
- name: Checkout repository
if: ${{ runner.os == 'Linux' }}
uses: actions/checkout@v3
with:
fetch-depth: 1
- name: Checkout repository
if: ${{ runner.os == 'Windows' }}
uses: actions/checkout@v3
with:
fetch-depth: 1
submodules: true
- name: DiskANN Build CLI Applications
uses: ./.github/actions/build

- name: Generate Data
uses: ./.github/actions/generate-random

- name: test a streaming index (float)
run: |
dist/bin/test_streaming_scenario --data_type float --dist_fn l2 --data_path data/rand_float_10D_10K_norm1.0.bin --index_path_prefix data/index_stream -R 64 -L 600 --alpha 1.2 --insert_threads 4 --consolidate_threads 4 --max_points_to_insert 10000 --active_window 4000 --consolidate_interval 2000 --start_point_norm 3.2
dist/bin/compute_groundtruth --data_type float --dist_fn l2 --base_file data/index_stream.after-streaming-act4000-cons2000-max10000.data --query_file data/rand_float_10D_1K_norm1.0.bin --K 100 --gt_file data/gt100_base-act4000-cons2000-max10000 --tags_file data/index_stream.after-streaming-act4000-cons2000-max10000.tags
dist/bin/search_memory_index --data_type float --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/index_stream.after-streaming-act4000-cons2000-max10000 --result_path data/res_stream --query_file data/rand_float_10D_1K_norm1.0.bin --gt_file data/gt100_base-act4000-cons2000-max10000 -K 10 -L 20 40 60 80 100 -T 64 --dynamic true --tags 1
- name: test a streaming index (int8)
if: success() || failure()
run: |
dist/bin/test_streaming_scenario --data_type int8 --dist_fn l2 --data_path data/rand_int8_10D_10K_norm50.0.bin --index_path_prefix data/index_stream -R 64 -L 600 --alpha 1.2 --insert_threads 4 --consolidate_threads 4 --max_points_to_insert 10000 --active_window 4000 --consolidate_interval 2000 --start_point_norm 200
dist/bin/compute_groundtruth --data_type int8 --dist_fn l2 --base_file data/index_stream.after-streaming-act4000-cons2000-max10000.data --query_file data/rand_int8_10D_1K_norm50.0.bin --K 100 --gt_file data/gt100_base-act4000-cons2000-max10000 --tags_file data/index_stream.after-streaming-act4000-cons2000-max10000.tags
dist/bin/search_memory_index --data_type int8 --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/index_stream.after-streaming-act4000-cons2000-max10000 --result_path res_stream --query_file data/rand_int8_10D_1K_norm50.0.bin --gt_file data/gt100_base-act4000-cons2000-max10000 -K 10 -L 20 40 60 80 100 -T 64 --dynamic true --tags 1
- name: test a streaming index (uint8)
if: success() || failure()
run: |
dist/bin/test_streaming_scenario --data_type uint8 --dist_fn l2 --data_path data/rand_uint8_10D_10K_norm50.0.bin --index_path_prefix data/index_stream -R 64 -L 600 --alpha 1.2 --insert_threads 4 --consolidate_threads 4 --max_points_to_insert 10000 --active_window 4000 --consolidate_interval 2000 --start_point_norm 200
dist/bin/compute_groundtruth --data_type uint8 --dist_fn l2 --base_file data/index_stream.after-streaming-act4000-cons2000-max10000.data --query_file data/rand_uint8_10D_1K_norm50.0.bin --K 100 --gt_file data/gt100_base-act4000-cons2000-max10000 --tags_file data/index_stream.after-streaming-act4000-cons2000-max10000.tags
dist/bin/search_memory_index --data_type uint8 --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/index_stream.after-streaming-act4000-cons2000-max10000 --result_path data/res_stream --query_file data/rand_uint8_10D_1K_norm50.0.bin --gt_file data/gt100_base-act4000-cons2000-max10000 -K 10 -L 20 40 60 80 100 -T 64 --dynamic true --tags 1
- name: build and search an incremental index (float)
if: success() || failure()
run: |
dist/bin/test_insert_deletes_consolidate --data_type float --dist_fn l2 --data_path data/rand_float_10D_10K_norm1.0.bin --index_path_prefix data/index_ins_del -R 64 -L 300 --alpha 1.2 -T 8 --points_to_skip 0 --max_points_to_insert 7500 --beginning_index_size 0 --points_per_checkpoint 1000 --checkpoints_per_snapshot 0 --points_to_delete_from_beginning 2500 --start_deletes_after 5000 --do_concurrent true --start_point_norm 3.2;
dist/bin/compute_groundtruth --data_type float --dist_fn l2 --base_file data/index_ins_del.after-concurrent-delete-del2500-7500.data --query_file data/rand_float_10D_1K_norm1.0.bin --K 100 --gt_file data/gt100_random10D_1K-conc-2500-7500 --tags_file data/index_ins_del.after-concurrent-delete-del2500-7500.tags
dist/bin/search_memory_index --data_type float --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/index_ins_del.after-concurrent-delete-del2500-7500 --result_path data/res_ins_del --query_file data/rand_float_10D_1K_norm1.0.bin --gt_file data/gt100_random10D_1K-conc-2500-7500 -K 10 -L 20 40 60 80 100 -T 8 --dynamic true --tags 1
- name: build and search an incremental index (int8)
if: success() || failure()
run: |
dist/bin/test_insert_deletes_consolidate --data_type int8 --dist_fn l2 --data_path data/rand_int8_10D_10K_norm50.0.bin --index_path_prefix data/index_ins_del -R 64 -L 300 --alpha 1.2 -T 8 --points_to_skip 0 --max_points_to_insert 7500 --beginning_index_size 0 --points_per_checkpoint 1000 --checkpoints_per_snapshot 0 --points_to_delete_from_beginning 2500 --start_deletes_after 5000 --do_concurrent true --start_point_norm 200
dist/bin/compute_groundtruth --data_type int8 --dist_fn l2 --base_file data/index_ins_del.after-concurrent-delete-del2500-7500.data --query_file data/rand_int8_10D_1K_norm50.0.bin --K 100 --gt_file data/gt100_random10D_1K-conc-2500-7500 --tags_file data/index_ins_del.after-concurrent-delete-del2500-7500.tags
dist/bin/search_memory_index --data_type int8 --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/index_ins_del.after-concurrent-delete-del2500-7500 --result_path data/res_ins_del --query_file data/rand_int8_10D_1K_norm50.0.bin --gt_file data/gt100_random10D_1K-conc-2500-7500 -K 10 -L 20 40 60 80 100 -T 8 --dynamic true --tags 1
- name: build and search an incremental index (uint8)
if: success() || failure()
run: |
dist/bin/test_insert_deletes_consolidate --data_type uint8 --dist_fn l2 --data_path data/rand_uint8_10D_10K_norm50.0.bin --index_path_prefix data/index_ins_del -R 64 -L 300 --alpha 1.2 -T 8 --points_to_skip 0 --max_points_to_insert 7500 --beginning_index_size 0 --points_per_checkpoint 1000 --checkpoints_per_snapshot 0 --points_to_delete_from_beginning 2500 --start_deletes_after 5000 --do_concurrent true --start_point_norm 200
dist/bin/compute_groundtruth --data_type uint8 --dist_fn l2 --base_file data/index_ins_del.after-concurrent-delete-del2500-7500.data --query_file data/rand_uint8_10D_1K_norm50.0.bin --K 100 --gt_file data/gt100_random10D_10K-conc-2500-7500 --tags_file data/index_ins_del.after-concurrent-delete-del2500-7500.tags
dist/bin/search_memory_index --data_type uint8 --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/index_ins_del.after-concurrent-delete-del2500-7500 --result_path data/res_ins_del --query_file data/rand_uint8_10D_1K_norm50.0.bin --gt_file data/gt100_random10D_10K-conc-2500-7500 -K 10 -L 20 40 60 80 100 -T 8 --dynamic true --tags 1
- name: upload data and bin
uses: actions/upload-artifact@v4
with:
name: dynamic-${{matrix.os}}
path: |
./dist/**
./data/**
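The float, int8 and uint8 steps in the workflow above differ only in the data file they read and in the value passed to --start_point_norm. Below is a sketch of that mapping as shell helpers, assuming the values used in these steps. The names data_norm_for and start_point_norm_for are hypothetical, not DiskANN tools.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: map a data type to the norm embedded in the generated
# data file name and to the --start_point_norm value this workflow pairs with it.
# The values are copied from the steps in this workflow, not from DiskANN docs.

data_norm_for() {
  case "$1" in
    float)      echo 1.0 ;;
    int8|uint8) echo 50.0 ;;
    *)          echo "unsupported data type: $1" >&2; return 1 ;;
  esac
}

start_point_norm_for() {
  case "$1" in
    float)      echo 3.2 ;;
    int8|uint8) echo 200 ;;
    *)          echo "unsupported data type: $1" >&2; return 1 ;;
  esac
}

# Print the per-type arguments that change between the otherwise identical steps.
for data_type in float int8 uint8; do
  data_path="data/rand_${data_type}_10D_10K_norm$(data_norm_for "$data_type").bin"
  echo "--data_type $data_type --data_path $data_path --start_point_norm $(start_point_norm_for "$data_type")"
done
```

Keeping the two lookups next to each other makes it harder to pair an int8 data file with the float start-point norm when a step is copied.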
81
packages/leann-backend-diskann/third_party/DiskANN/.github/workflows/in-mem-no-pq.yml
vendored
Normal file
@@ -0,0 +1,81 @@
name: In-Memory Without PQ
on: [workflow_call]
jobs:
acceptance-tests-mem-no-pq:
name: In-Mem, Without PQ
strategy:
fail-fast: false
matrix:
os: [ubuntu-latest, windows-2019, windows-latest]
runs-on: ${{matrix.os}}
defaults:
run:
shell: bash
steps:
- name: Checkout repository
if: ${{ runner.os == 'Linux' }}
uses: actions/checkout@v3
with:
fetch-depth: 1
- name: Checkout repository
if: ${{ runner.os == 'Windows' }}
uses: actions/checkout@v3
with:
fetch-depth: 1
submodules: true
- name: DiskANN Build CLI Applications
uses: ./.github/actions/build

- name: Generate Data
uses: ./.github/actions/generate-random

- name: build and search in-memory index with L2 metrics (float)
if: success() || failure()
run: |
dist/bin/build_memory_index --data_type float --dist_fn l2 --data_path data/rand_float_10D_10K_norm1.0.bin --index_path_prefix data/index_l2_rand_float_10D_10K_norm1.0
dist/bin/search_memory_index --data_type float --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/index_l2_rand_float_10D_10K_norm1.0 --query_file data/rand_float_10D_1K_norm1.0.bin --recall_at 10 --result_path temp --gt_file data/l2_rand_float_10D_10K_norm1.0_10D_1K_norm1.0_gt100 -L 16 32
- name: build and search in-memory index with L2 metrics (int8)
if: success() || failure()
run: |
dist/bin/build_memory_index --data_type int8 --dist_fn l2 --data_path data/rand_int8_10D_10K_norm50.0.bin --index_path_prefix data/index_l2_rand_int8_10D_10K_norm50.0
dist/bin/search_memory_index --data_type int8 --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/index_l2_rand_int8_10D_10K_norm50.0 --query_file data/rand_int8_10D_1K_norm50.0.bin --recall_at 10 --result_path temp --gt_file data/l2_rand_int8_10D_10K_norm50.0_10D_1K_norm50.0_gt100 -L 16 32
- name: build and search in-memory index with L2 metrics (uint8)
if: success() || failure()
run: |
dist/bin/build_memory_index --data_type uint8 --dist_fn l2 --data_path data/rand_uint8_10D_10K_norm50.0.bin --index_path_prefix data/index_l2_rand_uint8_10D_10K_norm50.0
dist/bin/search_memory_index --data_type uint8 --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/index_l2_rand_uint8_10D_10K_norm50.0 --query_file data/rand_uint8_10D_1K_norm50.0.bin --recall_at 10 --result_path temp --gt_file data/l2_rand_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100 -L 16 32
- name: Searching with fast_l2 distance function (float)
if: runner.os != 'Windows' && (success() || failure())
run: |
dist/bin/search_memory_index --data_type float --dist_fn fast_l2 --fail_if_recall_below 70 --index_path_prefix data/index_l2_rand_float_10D_10K_norm1.0 --query_file data/rand_float_10D_1K_norm1.0.bin --recall_at 10 --result_path temp --gt_file data/l2_rand_float_10D_10K_norm1.0_10D_1K_norm1.0_gt100 -L 16 32
- name: build and search in-memory index with MIPS metric (float)
if: success() || failure()
run: |
dist/bin/build_memory_index --data_type float --dist_fn mips --data_path data/rand_float_10D_10K_norm1.0.bin --index_path_prefix data/index_mips_rand_float_10D_10K_norm1.0
dist/bin/search_memory_index --data_type float --dist_fn mips --fail_if_recall_below 70 --index_path_prefix data/index_mips_rand_float_10D_10K_norm1.0 --query_file data/rand_float_10D_1K_norm1.0.bin --recall_at 10 --result_path temp --gt_file data/mips_rand_float_10D_10K_norm1.0_10D_1K_norm1.0_gt100 -L 16 32
- name: build and search in-memory index with cosine metric (float)
if: success() || failure()
run: |
dist/bin/build_memory_index --data_type float --dist_fn cosine --data_path data/rand_float_10D_10K_norm1.0.bin --index_path_prefix data/index_cosine_rand_float_10D_10K_norm1.0
dist/bin/search_memory_index --data_type float --dist_fn cosine --fail_if_recall_below 70 --index_path_prefix data/index_cosine_rand_float_10D_10K_norm1.0 --query_file data/rand_float_10D_1K_norm1.0.bin --recall_at 10 --result_path temp --gt_file data/cosine_rand_float_10D_10K_norm1.0_10D_1K_norm1.0_gt100 -L 16 32
- name: build and search in-memory index with cosine metric (int8)
if: success() || failure()
run: |
dist/bin/build_memory_index --data_type int8 --dist_fn cosine --data_path data/rand_int8_10D_10K_norm50.0.bin --index_path_prefix data/index_cosine_rand_int8_10D_10K_norm50.0
dist/bin/search_memory_index --data_type int8 --dist_fn cosine --fail_if_recall_below 70 --index_path_prefix data/index_cosine_rand_int8_10D_10K_norm50.0 --query_file data/rand_int8_10D_1K_norm50.0.bin --recall_at 10 --result_path temp --gt_file data/cosine_rand_int8_10D_10K_norm50.0_10D_1K_norm50.0_gt100 -L 16 32
- name: build and search in-memory index with cosine metric (uint8)
if: success() || failure()
run: |
dist/bin/build_memory_index --data_type uint8 --dist_fn cosine --data_path data/rand_uint8_10D_10K_norm50.0.bin --index_path_prefix data/index_cosine_rand_uint8_10D_10K_norm50.0
dist/bin/search_memory_index --data_type uint8 --dist_fn cosine --fail_if_recall_below 70 --index_path_prefix data/index_cosine_rand_uint8_10D_10K_norm50.0 --query_file data/rand_uint8_10D_1K_norm50.0.bin --recall_at 10 --result_path temp --gt_file data/cosine_rand_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100 -L 16 32
- name: upload data and bin
uses: actions/upload-artifact@v4
with:
name: in-memory-no-pq-${{matrix.os}}
path: |
./dist/**
./data/**
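Each build step in the workflow above writes an index whose prefix encodes the distance function and data type, and each search step reads a ground-truth file named the same way. Below is a hedged sketch of those two naming patterns, inferred from the paths in these steps. The helpers index_prefix and gt_file are hypothetical names, not DiskANN commands.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: build the index prefix and the ground-truth path for a
# given distance function, data type and norm. The patterns are an assumption
# based on the file names used in this workflow.

index_prefix() {
  # $1 = dist_fn (l2, mips, cosine), $2 = data type, $3 = norm (1.0 or 50.0)
  echo "data/index_${1}_rand_${2}_10D_10K_norm${3}"
}

gt_file() {
  # Same arguments as index_prefix.
  echo "data/${1}_rand_${2}_10D_10K_norm${3}_10D_1K_norm${3}_gt100"
}

dist_fn=cosine
data_type=int8
norm=50.0
echo "--dist_fn $dist_fn --index_path_prefix $(index_prefix "$dist_fn" "$data_type" "$norm")"
echo "--gt_file $(gt_file "$dist_fn" "$data_type" "$norm")"
```

Computing both paths from the same three values keeps the searched index and the ground truth on the same metric as the --dist_fn argument.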
56
packages/leann-backend-diskann/third_party/DiskANN/.github/workflows/in-mem-pq.yml
vendored
Normal file
@@ -0,0 +1,56 @@
name: In-Memory With PQ
on: [workflow_call]
jobs:
acceptance-tests-mem-pq:
name: In-Mem, PQ
strategy:
fail-fast: false
matrix:
os: [ubuntu-latest, windows-2019, windows-latest]
runs-on: ${{matrix.os}}
defaults:
run:
shell: bash
steps:
- name: Checkout repository
if: ${{ runner.os == 'Linux' }}
uses: actions/checkout@v3
with:
fetch-depth: 1
- name: Checkout repository
if: ${{ runner.os == 'Windows' }}
uses: actions/checkout@v3
with:
fetch-depth: 1
submodules: true
- name: DiskANN Build CLI Applications
uses: ./.github/actions/build

- name: Generate Data
uses: ./.github/actions/generate-random

- name: build and search in-memory index with L2 metric with PQ based distance comparisons (float)
if: success() || failure()
run: |
dist/bin/build_memory_index --data_type float --dist_fn l2 --data_path data/rand_float_10D_10K_norm1.0.bin --index_path_prefix data/index_l2_rand_float_10D_10K_norm1.0_buildpq5 --build_PQ_bytes 5
dist/bin/search_memory_index --data_type float --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/index_l2_rand_float_10D_10K_norm1.0_buildpq5 --query_file data/rand_float_10D_1K_norm1.0.bin --recall_at 10 --result_path temp --gt_file data/l2_rand_float_10D_10K_norm1.0_10D_1K_norm1.0_gt100 -L 16 32
- name: build and search in-memory index with L2 metric with PQ based distance comparisons (int8)
if: success() || failure()
run: |
dist/bin/build_memory_index --data_type int8 --dist_fn l2 --data_path data/rand_int8_10D_10K_norm50.0.bin --index_path_prefix data/index_l2_rand_int8_10D_10K_norm50.0_buildpq5 --build_PQ_bytes 5
dist/bin/search_memory_index --data_type int8 --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/index_l2_rand_int8_10D_10K_norm50.0_buildpq5 --query_file data/rand_int8_10D_1K_norm50.0.bin --recall_at 10 --result_path temp --gt_file data/l2_rand_int8_10D_10K_norm50.0_10D_1K_norm50.0_gt100 -L 16 32
- name: build and search in-memory index with L2 metric with PQ based distance comparisons (uint8)
if: success() || failure()
run: |
dist/bin/build_memory_index --data_type uint8 --dist_fn l2 --data_path data/rand_uint8_10D_10K_norm50.0.bin --index_path_prefix data/index_l2_rand_uint8_10D_10K_norm50.0_buildpq5 --build_PQ_bytes 5
dist/bin/search_memory_index --data_type uint8 --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/index_l2_rand_uint8_10D_10K_norm50.0_buildpq5 --query_file data/rand_uint8_10D_1K_norm50.0.bin --recall_at 10 --result_path temp --gt_file data/l2_rand_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100 -L 16 32
- name: upload data and bin
uses: actions/upload-artifact@v4
with:
name: in-memory-pq-${{matrix.os}}
path: |
./dist/**
./data/**
120
packages/leann-backend-diskann/third_party/DiskANN/.github/workflows/labels.yml
vendored
Normal file
@@ -0,0 +1,120 @@
name: Labels
on: [workflow_call]
jobs:
acceptance-tests-labels:
name: Labels
strategy:
fail-fast: false
matrix:
os: [ubuntu-latest, windows-2019, windows-latest]
runs-on: ${{matrix.os}}
defaults:
run:
shell: bash
steps:
- name: Checkout repository
if: ${{ runner.os == 'Linux' }}
uses: actions/checkout@v3
with:
fetch-depth: 1
- name: Checkout repository
if: ${{ runner.os == 'Windows' }}
uses: actions/checkout@v3
with:
fetch-depth: 1
submodules: true
- name: DiskANN Build CLI Applications
uses: ./.github/actions/build

- name: Generate Data
uses: ./.github/actions/generate-random

- name: Generate Labels
run: |
echo "Generating synthetic labels and computing ground truth for filtered search with universal label"
dist/bin/generate_synthetic_labels --num_labels 50 --num_points 10000 --output_file data/rand_labels_50_10K.txt --distribution_type random
dist/bin/compute_groundtruth_for_filters --data_type uint8 --dist_fn l2 --universal_label 0 --filter_label 10 --base_file data/rand_uint8_10D_10K_norm50.0.bin --query_file data/rand_uint8_10D_1K_norm50.0.bin --label_file data/rand_labels_50_10K.txt --gt_file data/l2_rand_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100_wlabel --K 100
dist/bin/compute_groundtruth_for_filters --data_type uint8 --dist_fn mips --universal_label 0 --filter_label 10 --base_file data/rand_uint8_10D_10K_norm50.0.bin --query_file data/rand_uint8_10D_1K_norm50.0.bin --label_file data/rand_labels_50_10K.txt --gt_file data/mips_rand_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100_wlabel --K 100
|
||||
dist/bin/compute_groundtruth_for_filters --data_type uint8 --dist_fn cosine --universal_label 0 --filter_label 10 --base_file data/rand_uint8_10D_10K_norm50.0.bin --query_file data/rand_uint8_10D_1K_norm50.0.bin --label_file data/rand_labels_50_10K.txt --gt_file data/cosine_rand_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100_wlabel --K 100
|
||||
|
||||
echo "Generating synthetic labels with a zipf distribution and computing ground truth for filtered search with universal label"
|
||||
dist/bin/generate_synthetic_labels --num_labels 50 --num_points 10000 --output_file data/zipf_labels_50_10K.txt --distribution_type zipf
|
||||
dist/bin/compute_groundtruth_for_filters --data_type uint8 --dist_fn l2 --universal_label 0 --filter_label 5 --base_file data/rand_uint8_10D_10K_norm50.0.bin --query_file data/rand_uint8_10D_1K_norm50.0.bin --label_file data/zipf_labels_50_10K.txt --gt_file data/l2_zipf_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100_wlabel --K 100
|
||||
dist/bin/compute_groundtruth_for_filters --data_type uint8 --dist_fn mips --universal_label 0 --filter_label 5 --base_file data/rand_uint8_10D_10K_norm50.0.bin --query_file data/rand_uint8_10D_1K_norm50.0.bin --label_file data/zipf_labels_50_10K.txt --gt_file data/mips_zipf_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100_wlabel --K 100
|
||||
dist/bin/compute_groundtruth_for_filters --data_type uint8 --dist_fn cosine --universal_label 0 --filter_label 5 --base_file data/rand_uint8_10D_10K_norm50.0.bin --query_file data/rand_uint8_10D_1K_norm50.0.bin --label_file data/zipf_labels_50_10K.txt --gt_file data/cosine_zipf_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100_wlabel --K 100
|
||||
|
||||
echo "Generating synthetic labels and computing ground truth for filtered search without a universal label"
|
||||
dist/bin/compute_groundtruth_for_filters --data_type uint8 --dist_fn l2 --filter_label 5 --base_file data/rand_uint8_10D_10K_norm50.0.bin --query_file data/rand_uint8_10D_1K_norm50.0.bin --label_file data/zipf_labels_50_10K.txt --gt_file data/l2_zipf_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100_wlabel_nouniversal --K 100
|
||||
dist/bin/generate_synthetic_labels --num_labels 10 --num_points 1000 --output_file data/query_labels_1K.txt --distribution_type one_per_point
|
||||
dist/bin/compute_groundtruth_for_filters --data_type uint8 --dist_fn l2 --universal_label 0 --filter_label_file data/query_labels_1K.txt --base_file data/rand_uint8_10D_10K_norm50.0.bin --query_file data/rand_uint8_10D_1K_norm50.0.bin --label_file data/zipf_labels_50_10K.txt --gt_file data/combined_l2_zipf_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100_wlabel --K 100
|
||||
|
||||
- name: build and search in-memory index with labels using L2 and Cosine metrics (random distributed labels)
|
||||
if: success() || failure()
|
||||
run: |
|
||||
dist/bin/build_memory_index --data_type uint8 --dist_fn l2 --FilteredLbuild 90 --universal_label 0 --data_path data/rand_uint8_10D_10K_norm50.0.bin --label_file data/rand_labels_50_10K.txt --index_path_prefix data/index_l2_rand_uint8_10D_10K_norm50_wlabel
|
||||
dist/bin/build_memory_index --data_type uint8 --dist_fn cosine --FilteredLbuild 90 --universal_label 0 --data_path data/rand_uint8_10D_10K_norm50.0.bin --label_file data/rand_labels_50_10K.txt --index_path_prefix data/index_cosine_rand_uint8_10D_10K_norm50_wlabel
|
||||
dist/bin/search_memory_index --data_type uint8 --dist_fn l2 --filter_label 10 --fail_if_recall_below 70 --index_path_prefix data/index_l2_rand_uint8_10D_10K_norm50_wlabel --query_file data/rand_uint8_10D_1K_norm50.0.bin --recall_at 10 --result_path temp --gt_file data/l2_rand_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100_wlabel -L 16 32
|
||||
dist/bin/search_memory_index --data_type uint8 --dist_fn cosine --filter_label 10 --fail_if_recall_below 70 --index_path_prefix data/index_cosine_rand_uint8_10D_10K_norm50_wlabel --query_file data/rand_uint8_10D_1K_norm50.0.bin --recall_at 10 --result_path temp --gt_file data/cosine_rand_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100_wlabel -L 16 32
|
||||
|
||||
echo "Searching without filters"
|
||||
dist/bin/search_memory_index --data_type uint8 --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/index_l2_rand_uint8_10D_10K_norm50_wlabel --query_file data/rand_uint8_10D_1K_norm50.0.bin --recall_at 10 --result_path temp --gt_file data/l2_rand_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100 -L 32 64
|
||||
dist/bin/search_memory_index --data_type uint8 --dist_fn cosine --fail_if_recall_below 70 --index_path_prefix data/index_cosine_rand_uint8_10D_10K_norm50_wlabel --query_file data/rand_uint8_10D_1K_norm50.0.bin --recall_at 10 --result_path temp --gt_file data/cosine_rand_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100 -L 32 64
|
||||
|
||||
- name: build and search disk index with labels using L2 and Cosine metrics (random distributed labels)
|
||||
if: success() || failure()
|
||||
run: |
|
||||
dist/bin/build_disk_index --data_type uint8 --dist_fn l2 --universal_label 0 --FilteredLbuild 90 --data_path data/rand_uint8_10D_10K_norm50.0.bin --label_file data/rand_labels_50_10K.txt --index_path_prefix data/disk_index_l2_rand_uint8_10D_10K_norm50_wlabel -R 32 -L 5 -B 0.00003 -M 1
|
||||
dist/bin/search_disk_index --data_type uint8 --dist_fn l2 --filter_label 10 --fail_if_recall_below 50 --index_path_prefix data/disk_index_l2_rand_uint8_10D_10K_norm50_wlabel --result_path temp --query_file data/rand_uint8_10D_1K_norm50.0.bin --gt_file data/l2_rand_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100_wlabel --recall_at 5 -L 5 12 -W 2 --num_nodes_to_cache 10 -T 16
|
||||
- name: build and search in-memory index with labels using L2 and Cosine metrics (zipf distributed labels)
|
||||
if: success() || failure()
|
||||
run: |
|
||||
dist/bin/build_memory_index --data_type uint8 --dist_fn l2 --FilteredLbuild 90 --universal_label 0 --data_path data/rand_uint8_10D_10K_norm50.0.bin --label_file data/zipf_labels_50_10K.txt --index_path_prefix data/index_l2_zipf_uint8_10D_10K_norm50_wlabel
|
||||
dist/bin/build_memory_index --data_type uint8 --dist_fn cosine --FilteredLbuild 90 --universal_label 0 --data_path data/rand_uint8_10D_10K_norm50.0.bin --label_file data/zipf_labels_50_10K.txt --index_path_prefix data/index_cosine_zipf_uint8_10D_10K_norm50_wlabel
|
||||
dist/bin/search_memory_index --data_type uint8 --dist_fn l2 --filter_label 5 --fail_if_recall_below 70 --index_path_prefix data/index_l2_zipf_uint8_10D_10K_norm50_wlabel --query_file data/rand_uint8_10D_1K_norm50.0.bin --recall_at 10 --result_path temp --gt_file data/l2_zipf_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100_wlabel -L 16 32
|
||||
dist/bin/search_memory_index --data_type uint8 --dist_fn cosine --filter_label 5 --fail_if_recall_below 70 --index_path_prefix data/index_cosine_zipf_uint8_10D_10K_norm50_wlabel --query_file data/rand_uint8_10D_1K_norm50.0.bin --recall_at 10 --result_path temp --gt_file data/cosine_zipf_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100_wlabel -L 16 32
|
||||
|
||||
echo "Searching without filters"
|
||||
dist/bin/compute_groundtruth --data_type uint8 --dist_fn l2 --base_file data/rand_uint8_10D_10K_norm50.0.bin --query_file data/rand_uint8_10D_1K_norm50.0.bin --gt_file data/l2_zipf_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100 --K 100
|
||||
dist/bin/compute_groundtruth --data_type uint8 --dist_fn cosine --base_file data/rand_uint8_10D_10K_norm50.0.bin --query_file data/rand_uint8_10D_1K_norm50.0.bin --gt_file data/cosine_zipf_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100 --K 100
|
||||
dist/bin/search_memory_index --data_type uint8 --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/index_l2_zipf_uint8_10D_10K_norm50_wlabel --query_file data/rand_uint8_10D_1K_norm50.0.bin --recall_at 10 --result_path temp --gt_file data/l2_zipf_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100 -L 32 64
|
||||
dist/bin/search_memory_index --data_type uint8 --dist_fn cosine --fail_if_recall_below 70 --index_path_prefix data/index_cosine_zipf_uint8_10D_10K_norm50_wlabel --query_file data/rand_uint8_10D_1K_norm50.0.bin --recall_at 10 --result_path temp --gt_file data/cosine_zipf_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100 -L 32 64
|
||||
|
||||
- name: build and search disk index with labels using L2 and Cosine metrics (zipf distributed labels)
|
||||
if: success() || failure()
|
||||
run: |
|
||||
dist/bin/build_disk_index --data_type uint8 --dist_fn l2 --universal_label 0 --FilteredLbuild 90 --data_path data/rand_uint8_10D_10K_norm50.0.bin --label_file data/zipf_labels_50_10K.txt --index_path_prefix data/disk_index_l2_zipf_uint8_10D_10K_norm50_wlabel -R 32 -L 5 -B 0.00003 -M 1
|
||||
dist/bin/search_disk_index --data_type uint8 --dist_fn l2 --filter_label 5 --fail_if_recall_below 50 --index_path_prefix data/disk_index_l2_zipf_uint8_10D_10K_norm50_wlabel --result_path temp --query_file data/rand_uint8_10D_1K_norm50.0.bin --gt_file data/l2_zipf_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100_wlabel --recall_at 5 -L 5 12 -W 2 --num_nodes_to_cache 10 -T 16
|
||||
|
||||
- name : build and search in-memory and disk index (without universal label, zipf distributed)
|
||||
if: success() || failure()
|
||||
run: |
|
||||
dist/bin/build_memory_index --data_type uint8 --dist_fn l2 --FilteredLbuild 90 --data_path data/rand_uint8_10D_10K_norm50.0.bin --label_file data/zipf_labels_50_10K.txt --index_path_prefix data/index_l2_zipf_uint8_10D_10K_norm50_wlabel_nouniversal
|
||||
dist/bin/build_disk_index --data_type uint8 --dist_fn l2 --FilteredLbuild 90 --data_path data/rand_uint8_10D_10K_norm50.0.bin --label_file data/zipf_labels_50_10K.txt --index_path_prefix data/disk_index_l2_zipf_uint8_10D_10K_norm50_wlabel_nouniversal -R 32 -L 5 -B 0.00003 -M 1
|
||||
dist/bin/search_memory_index --data_type uint8 --dist_fn l2 --filter_label 5 --fail_if_recall_below 70 --index_path_prefix data/index_l2_zipf_uint8_10D_10K_norm50_wlabel_nouniversal --query_file data/rand_uint8_10D_1K_norm50.0.bin --recall_at 10 --result_path temp --gt_file data/l2_zipf_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100_wlabel_nouniversal -L 16 32
|
||||
dist/bin/search_disk_index --data_type uint8 --dist_fn l2 --filter_label 5 --index_path_prefix data/disk_index_l2_zipf_uint8_10D_10K_norm50_wlabel_nouniversal --result_path temp --query_file data/rand_uint8_10D_1K_norm50.0.bin --gt_file data/l2_zipf_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100_wlabel_nouniversal --recall_at 5 -L 5 12 -W 2 --num_nodes_to_cache 10 -T 16
|
||||
- name: Generate combined GT for each query with a separate label and search
|
||||
if: success() || failure()
|
||||
run: |
|
||||
dist/bin/build_memory_index --data_type uint8 --dist_fn l2 --FilteredLbuild 90 --universal_label 0 --data_path data/rand_uint8_10D_10K_norm50.0.bin --label_file data/zipf_labels_50_10K.txt --index_path_prefix data/index_l2_zipf_uint8_10D_10K_norm50_wlabel
|
||||
dist/bin/search_memory_index --data_type uint8 --dist_fn l2 --query_filters_file data/query_labels_1K.txt --fail_if_recall_below 70 --index_path_prefix data/index_l2_zipf_uint8_10D_10K_norm50_wlabel --query_file data/rand_uint8_10D_1K_norm50.0.bin --recall_at 10 --result_path temp --gt_file data/combined_l2_zipf_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100_wlabel -L 16 32
|
||||
- name: build and search in-memory index with pq_dist of 5 with 10 dimensions
|
||||
if: success() || failure()
|
||||
run: |
|
||||
dist/bin/build_memory_index --data_type uint8 --dist_fn l2 --FilteredLbuild 90 --universal_label 0 --data_path data/rand_uint8_10D_10K_norm50.0.bin --label_file data/rand_labels_50_10K.txt --index_path_prefix data/index_l2_rand_uint8_10D_10K_norm50_wlabel --build_PQ_bytes 5
|
||||
dist/bin/search_memory_index --data_type uint8 --dist_fn l2 --filter_label 10 --fail_if_recall_below 70 --index_path_prefix data/index_l2_rand_uint8_10D_10K_norm50_wlabel --query_file data/rand_uint8_10D_1K_norm50.0.bin --recall_at 10 --result_path temp --gt_file data/l2_rand_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100_wlabel -L 16 32
|
||||
- name: Build and search stitched vamana with random and zipf distributed labels
|
||||
if: success() || failure()
|
||||
run: |
|
||||
dist/bin/build_stitched_index --num_threads 48 --data_type uint8 --data_path data/rand_uint8_10D_10K_norm50.0.bin --label_file data/rand_labels_50_10K.txt -R 32 -L 100 --alpha 1.2 --stitched_R 64 --index_path_prefix data/stit_rand_32_100_64_new --universal_label 0
|
||||
dist/bin/build_stitched_index --num_threads 48 --data_type uint8 --data_path data/rand_uint8_10D_10K_norm50.0.bin --label_file data/zipf_labels_50_10K.txt -R 32 -L 100 --alpha 1.2 --stitched_R 64 --index_path_prefix data/stit_zipf_32_100_64_new --universal_label 0
|
||||
dist/bin/search_memory_index --num_threads 48 --data_type uint8 --dist_fn l2 --filter_label 10 --index_path_prefix data/stit_rand_32_100_64_new --query_file data/rand_uint8_10D_1K_norm50.0.bin --result_path data/rand_stit_96_10_90_new --gt_file data/l2_rand_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100_wlabel -K 10 -L 16 32 150
|
||||
dist/bin/search_memory_index --num_threads 48 --data_type uint8 --dist_fn l2 --filter_label 5 --index_path_prefix data/stit_zipf_32_100_64_new --query_file data/rand_uint8_10D_1K_norm50.0.bin --result_path data/zipf_stit_96_10_90_new --gt_file data/l2_zipf_uint8_10D_10K_norm50.0_10D_1K_norm50.0_gt100_wlabel -K 10 -L 16 32 150
|
||||
|
||||
- name: upload data and bin
|
||||
if: success() || failure()
|
||||
uses: actions/upload-artifact@v4
|
||||
with:
|
||||
name: labels-${{matrix.os}}
|
||||
path: |
|
||||
./dist/**
|
||||
./data/**
|
||||
60
packages/leann-backend-diskann/third_party/DiskANN/.github/workflows/multi-sector-disk-pq.yml
vendored
Normal file
@@ -0,0 +1,60 @@
name: Disk With PQ
on: [workflow_call]
jobs:
  acceptance-tests-disk-pq:
    name: Disk, PQ
    strategy:
      fail-fast: false
      matrix:
        os: [ubuntu-latest, windows-2019, windows-latest]
    runs-on: ${{matrix.os}}
    defaults:
      run:
        shell: bash
    steps:
      - name: Checkout repository
        if: ${{ runner.os == 'Linux' }}
        uses: actions/checkout@v3
        with:
          fetch-depth: 1
      - name: Checkout repository
        if: ${{ runner.os == 'Windows' }}
        uses: actions/checkout@v3
        with:
          fetch-depth: 1
          submodules: true
      - name: DiskANN Build CLI Applications
        uses: ./.github/actions/build

      - name: Generate Data
        uses: ./.github/actions/generate-high-dim-random

      - name: build and search disk index (1020D, one shot graph build, L2, no diskPQ) (float)
        if: success() || failure()
        run: |
          dist/bin/build_disk_index --data_type float --dist_fn l2 --data_path data/rand_float_1020D_5K_norm1.0.bin --index_path_prefix data/disk_index_l2_rand_float_1020D_5K_norm1.0_diskfull_oneshot -R 32 -L 500 -B 0.003 -M 1
          dist/bin/search_disk_index --data_type float --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/disk_index_l2_rand_float_1020D_5K_norm1.0_diskfull_oneshot --result_path /tmp/res --query_file data/rand_float_1020D_1K_norm1.0.bin --gt_file data/l2_rand_float_1020D_5K_norm1.0_1020D_1K_norm1.0_gt100 --recall_at 5 -L 250 -W 2 --num_nodes_to_cache 100 -T 16
      #- name: build and search disk index (1024D, one shot graph build, L2, no diskPQ) (float)
      #  if: success() || failure()
      #  run: |
      #    dist/bin/build_disk_index --data_type float --dist_fn l2 --data_path data/rand_float_1024D_5K_norm1.0.bin --index_path_prefix data/disk_index_l2_rand_float_1024D_5K_norm1.0_diskfull_oneshot -R 32 -L 500 -B 0.003 -M 1
      #    dist/bin/search_disk_index --data_type float --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/disk_index_l2_rand_float_1024D_5K_norm1.0_diskfull_oneshot --result_path /tmp/res --query_file data/rand_float_1024D_1K_norm1.0.bin --gt_file data/l2_rand_float_1024D_5K_norm1.0_1024D_1K_norm1.0_gt100 --recall_at 5 -L 250 -W 2 --num_nodes_to_cache 100 -T 16
      - name: build and search disk index (1536D, one shot graph build, L2, no diskPQ) (float)
        if: success() || failure()
        run: |
          dist/bin/build_disk_index --data_type float --dist_fn l2 --data_path data/rand_float_1536D_5K_norm1.0.bin --index_path_prefix data/disk_index_l2_rand_float_1536D_5K_norm1.0_diskfull_oneshot -R 32 -L 500 -B 0.003 -M 1
          dist/bin/search_disk_index --data_type float --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/disk_index_l2_rand_float_1536D_5K_norm1.0_diskfull_oneshot --result_path /tmp/res --query_file data/rand_float_1536D_1K_norm1.0.bin --gt_file data/l2_rand_float_1536D_5K_norm1.0_1536D_1K_norm1.0_gt100 --recall_at 5 -L 250 -W 2 --num_nodes_to_cache 100 -T 16

      - name: build and search disk index (4096D, one shot graph build, L2, no diskPQ) (int8)
        if: success() || failure()
        run: |
          dist/bin/build_disk_index --data_type int8 --dist_fn l2 --data_path data/rand_int8_4096D_5K_norm1.0.bin --index_path_prefix data/disk_index_l2_rand_int8_4096D_5K_norm1.0_diskfull_oneshot -R 32 -L 500 -B 0.003 -M 1
          dist/bin/search_disk_index --data_type int8 --dist_fn l2 --fail_if_recall_below 70 --index_path_prefix data/disk_index_l2_rand_int8_4096D_5K_norm1.0_diskfull_oneshot --result_path /tmp/res --query_file data/rand_int8_4096D_1K_norm1.0.bin --gt_file data/l2_rand_int8_4096D_5K_norm1.0_4096D_1K_norm1.0_gt100 --recall_at 5 -L 250 -W 2 --num_nodes_to_cache 100 -T 16

      - name: upload data and bin
        uses: actions/upload-artifact@v4
        with:
          name: multi-sector-disk-pq-${{matrix.os}}
          path: |
            ./dist/**
            ./data/**
26
packages/leann-backend-diskann/third_party/DiskANN/.github/workflows/perf.yml
vendored
Normal file
@@ -0,0 +1,26 @@
name: DiskANN Nightly Performance Metrics
on:
  schedule:
    - cron: "41 14 * * *" # 14:41 UTC, 07:41 PDT, 06:41 PST, 20:11 IST
jobs:
  perf-test:
    name: Run Perf Test from main
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v3
        with:
          fetch-depth: 1
      - name: Build Perf Container
        run: |
          docker build --build-arg GIT_COMMIT_ISH="$GITHUB_SHA" -t perf -f scripts/perf/Dockerfile scripts
      - name: Performance Tests
        run: |
          mkdir metrics
          docker run -v ./metrics:/app/logs perf &> ./metrics/combined_stdouterr.log
      - name: Upload Metrics Logs
        uses: actions/upload-artifact@v4
        with:
          name: metrics
          path: |
            ./metrics/**
35
packages/leann-backend-diskann/third_party/DiskANN/.github/workflows/pr-test.yml
vendored
Normal file
@@ -0,0 +1,35 @@
name: DiskANN Pull Request Build and Test
on: [pull_request]
jobs:
  common:
    strategy:
      fail-fast: true
    name: DiskANN Common Build Checks
    uses: ./.github/workflows/common.yml
  unit-tests:
    name: Unit tests
    uses: ./.github/workflows/unit-tests.yml
  in-mem-pq:
    name: In-Memory with PQ
    uses: ./.github/workflows/in-mem-pq.yml
  in-mem-no-pq:
    name: In-Memory without PQ
    uses: ./.github/workflows/in-mem-no-pq.yml
  disk-pq:
    name: Disk with PQ
    uses: ./.github/workflows/disk-pq.yml
  multi-sector-disk-pq:
    name: Multi-sector Disk with PQ
    uses: ./.github/workflows/multi-sector-disk-pq.yml
  labels:
    name: Labels
    uses: ./.github/workflows/labels.yml
  dynamic:
    name: Dynamic
    uses: ./.github/workflows/dynamic.yml
  dynamic-labels:
    name: Dynamic Labels
    uses: ./.github/workflows/dynamic-labels.yml
  python:
    name: Python
    uses: ./.github/workflows/build-python.yml
50
packages/leann-backend-diskann/third_party/DiskANN/.github/workflows/push-test.yml
vendored
Normal file
@@ -0,0 +1,50 @@
name: DiskANN Push Build
on: [push]
jobs:
  common:
    strategy:
      fail-fast: true
    name: DiskANN Common Build Checks
    uses: ./.github/workflows/common.yml
  build-documentation:
    permissions:
      contents: write
    strategy:
      fail-fast: true
    name: DiskANN Build Documentation
    uses: ./.github/workflows/build-python-pdoc.yml
  build:
    strategy:
      fail-fast: false
      matrix:
        os: [ ubuntu-latest, windows-2019, windows-latest ]
    name: Build for ${{matrix.os}}
    runs-on: ${{matrix.os}}
    defaults:
      run:
        shell: bash
    steps:
      - name: Checkout repository
        if: ${{ runner.os == 'Linux' }}
        uses: actions/checkout@v3
        with:
          fetch-depth: 1
      - name: Checkout repository
        if: ${{ runner.os == 'Windows' }}
        uses: actions/checkout@v3
        with:
          fetch-depth: 1
          submodules: true
      - name: Build diskannpy dependency tree
        run: |
          pip install diskannpy pipdeptree
          echo "dependencies" > dependencies_${{ matrix.os }}.txt
          pipdeptree >> dependencies_${{ matrix.os }}.txt
      - name: Archive diskannpy dependencies artifact
        uses: actions/upload-artifact@v4
        with:
          name: dependencies_${{ matrix.os }}
          path: |
            dependencies_${{ matrix.os }}.txt
      - name: DiskANN Build CLI Applications
        uses: ./.github/actions/build
43
packages/leann-backend-diskann/third_party/DiskANN/.github/workflows/python-release.yml
vendored
Normal file
@@ -0,0 +1,43 @@
name: Build and Release Python Wheels
on:
  release:
    types: [published]
jobs:
  python-release-wheels:
    name: Python
    uses: ./.github/workflows/build-python.yml
  build-documentation:
    strategy:
      fail-fast: true
    name: DiskANN Build Documentation
    uses: ./.github/workflows/build-python-pdoc.yml
  release:
    permissions:
      contents: write
    runs-on: ubuntu-latest
    needs: python-release-wheels
    steps:
      - uses: actions/download-artifact@v3
        with:
          name: wheels
          path: dist/
      - name: Generate SHA256 files for each wheel
        run: |
          sha256sum dist/*.whl > checksums.txt
          cat checksums.txt
      - uses: actions/setup-python@v3
      - name: Install twine
        run: python -m pip install twine
      - name: Publish with twine
        env:
          TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}
          TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
        run: |
          twine upload dist/*.whl
      - name: Update release with SHA256 and Artifacts
        uses: softprops/action-gh-release@v1
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          files: |
            dist/*.whl
            checksums.txt
32
packages/leann-backend-diskann/third_party/DiskANN/.github/workflows/unit-tests.yml
vendored
Normal file
@@ -0,0 +1,32 @@
name: Unit Tests
on: [workflow_call]
jobs:
  acceptance-tests-labels:
    name: Unit Tests
    strategy:
      fail-fast: false
      matrix:
        os: [ubuntu-latest, windows-2019, windows-latest]
    runs-on: ${{matrix.os}}
    defaults:
      run:
        shell: bash
    steps:
      - name: Checkout repository
        if: ${{ runner.os == 'Linux' }}
        uses: actions/checkout@v3
        with:
          fetch-depth: 1
      - name: Checkout repository
        if: ${{ runner.os == 'Windows' }}
        uses: actions/checkout@v3
        with:
          fetch-depth: 1
          submodules: true
      - name: DiskANN Build CLI Applications
        uses: ./.github/actions/build

      - name: Run Unit Tests
        run: |
          cd build
          ctest -C Release