feat: Grace-Blackwell unified memory optimization for ComfyUI

- Add model_management.py patch: detect GB10 unified memory (VRAM/RAM ratio > 0.95)
- Set HIGH_VRAM mode: skip CPU offloading, which is pointless when CPU and GPU share one physical memory pool
- Increase maximum_vram_for_weights from 88% to 95% (8.4GB headroom on 128GB)
- Skip torch.cuda.empty_cache() on unified memory (avoids page faults)
- Return GPU for text_encoder/vae/intermediate offload devices on unified memory
- Exclude MPS from unified detection (it has its own SHARED state)
- Remove PYTORCH_NO_CUDA_MEMORY_CACHING env var (patch handles caching properly)
- Mount patched file as read-only volume override in docker-compose.yml
- DeepSeek review: safe and correct for DGX Spark target
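
A minimal sketch of the decisions listed above, not the patched file itself (its diff is suppressed on this page). The ratio formula, all function names, and the local VRAMState stand-in are assumptions for illustration; only the 0.95 threshold, the 88% and 95% budgets, and the MPS exclusion come from this message. In a real run the byte counts would presumably come from something like torch.cuda.mem_get_info() and psutil.virtual_memory().total.

```python
from enum import Enum


# Hypothetical stand-in for ComfyUI's VRAM state enum; only the states
# relevant to this commit message are modeled here.
class VRAMState(Enum):
    NORMAL_VRAM = "normal"
    HIGH_VRAM = "high"
    SHARED = "shared"


# Threshold taken from the commit message. Reading "VRAM ≈ RAM > 0.95"
# as a smaller/larger ratio is an assumption.
UNIFIED_RATIO_THRESHOLD = 0.95


def is_unified_memory(total_vram: int, total_ram: int, is_mps: bool = False) -> bool:
    """Guess whether the GPU and CPU report the same physical memory pool."""
    # MPS is excluded: it is handled by its own SHARED state.
    if is_mps or total_vram <= 0 or total_ram <= 0:
        return False
    ratio = min(total_vram, total_ram) / max(total_vram, total_ram)
    return ratio > UNIFIED_RATIO_THRESHOLD


def pick_vram_state(total_vram: int, total_ram: int, is_mps: bool = False) -> VRAMState:
    """Choose HIGH_VRAM on unified memory so nothing is offloaded to the CPU."""
    if is_mps:
        return VRAMState.SHARED
    if is_unified_memory(total_vram, total_ram):
        return VRAMState.HIGH_VRAM
    return VRAMState.NORMAL_VRAM


def weight_budget_fraction(unified: bool) -> float:
    """Fraction of total memory allowed for weights: 95% unified, 88% otherwise."""
    return 0.95 if unified else 0.88


def should_empty_cache(unified: bool) -> bool:
    """Skip the cache flush on unified memory, as described in this commit."""
    return not unified
```

With a device reporting 120 GiB of VRAM and 124 GiB of RAM, the ratio is about 0.968, so this sketch would select HIGH_VRAM, use the 95% weight budget, and skip the cache flush. A discrete 24 GiB card in a 128 GiB host falls well below the threshold and keeps the default behavior.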

Co-authored-by: DeepSeek (code review)

Evan Carmen

2026-05-20 16:01:51 -05:00

parent 15fc70663f

commit 7e4d22e41c

3 changed files with 2167 additions and 2 deletions

patches/model_management.py

+1908

File diff suppressed because it is too large