Commit Graph

10 Commits

Author SHA1 Message Date
TBNilles 399acabd58 feat(model-manager): "Free GPU memory" button to unload ComfyUI models
ComfyUI caches the last model when RAM is plentiful (unified memory), so
reported memory use doesn't drop after switching models, even though models
are being swapped rather than accumulated. Add a sidebar "Free GPU memory"
button that
proxies ComfyUI's POST /free (unload_models + free_memory) via a new
/api/comfyui/free endpoint (COMFYUI_URL env). Verified it releases ~7GB.
README documents this plus the --disable-smart-memory auto-unload option.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-07 17:14:37 -04:00
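
The proxy described in commit 399acabd58 can be sketched roughly as follows. This is a minimal illustration rather than the Model Manager's actual handler: the function names `build_free_request` and `free_gpu_memory` are hypothetical, and the only details taken from the commit message are the POST to ComfyUI's /free route and the `unload_models` and `free_memory` flags.

```python
import json
import urllib.request


def build_free_request(comfyui_url: str) -> urllib.request.Request:
    """Build the POST /free request that asks ComfyUI to unload models.

    `comfyui_url` stands in for the COMFYUI_URL env value named in the commit.
    """
    payload = json.dumps({"unload_models": True, "free_memory": True}).encode("utf-8")
    return urllib.request.Request(
        url=comfyui_url.rstrip("/") + "/free",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def free_gpu_memory(comfyui_url: str, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code (hypothetical helper)."""
    with urllib.request.urlopen(build_free_request(comfyui_url), timeout=timeout) as resp:
        return resp.status
```

A backend route such as /api/comfyui/free would presumably call something like `free_gpu_memory(os.environ["COMFYUI_URL"])` and relay the status to the sidebar button.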
TBNilles 44f7b5a697 feat(comfyuimini): add "Manage Photos" sidebar link to the Gallery
ComfyUIMini's built-in gallery is view-only. Inject a "Manage Photos"
link into its sidebar (via the shared head.ejs partial at build time, so
no fork) that points to the Model Manager's delete-capable Gallery. The
URL is built client-side from the browser host; the port is baked from
the MODEL_MANAGER_PORT build arg (default 8189).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-07 15:46:04 -04:00
TBNilles c9fa3fcab5 feat(model-manager): generated-photo Gallery + device-routing landing
- Gallery view: grid of generated photos from ComfyUI's output/, full-size
  lightbox, and permanent delete (with confirm). Paginated ("Load more").
- Backend: GET /api/gallery, GET /gallery/file (path-guarded image serve),
  DELETE /api/gallery (path-guarded; clear error on permission denial).
- Mount ./output read-write into model-manager so the gallery can delete.
- Device-routing landing at /start: phones -> ComfyUIMini, desktops ->
  the Gallery; ?force=mobile|desktop overrides. Ports come from the new
  /api/ui-config (COMFYUI_PORT / COMFYUIMINI_PORT env).
- Responsive tweaks so the gallery is usable if opened directly on a phone.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-07 15:41:46 -04:00
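
The "path-guarded" serve and delete routes in commit c9fa3fcab5 imply a check that a requested file stays inside ComfyUI's output/ directory. One common way to do that is sketched below; the helper name `resolve_inside` is hypothetical and the real guard in the Model Manager may differ.

```python
from pathlib import Path


def resolve_inside(root: Path, requested: str) -> Path:
    """Return the resolved path for `requested`, refusing anything outside `root`.

    Resolving both sides first collapses `..` segments and symlinks, so a
    request like "../secret.png" cannot escape the gallery directory.
    """
    root = root.resolve()
    candidate = (root / requested).resolve()
    if root not in candidate.parents:
        raise PermissionError(f"{requested!r} is outside {root}")
    return candidate
```

A handler would call this before opening or unlinking the file and translate `PermissionError` into an HTTP error response.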
TBNilles 0b606721dd feat(model-manager): run container as host UID/GID
Downloads previously landed in models/ owned by root because the
container ran as root. Add `user: "${PUID:-1000}:${PGID:-1000}"` to the
model-manager service and PUID/PGID to .env.example so downloaded models
are owned by the host user. Defaults to 1000:1000.

Note: existing root-owned files under models/ and sparkyui-data/ must be
chowned once (e.g. via a one-off root container) when upgrading.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-07 14:55:57 -04:00
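
The one-time ownership fix mentioned in commit 0b606721dd could look something like the sketch below. It is an assumption about how one might do it, not a script from the repository: `chown_tree` is a hypothetical helper, it must run with enough privilege to change ownership (for example as root in a one-off container), and the `chown` parameter exists only so the walk can be dry-run.

```python
import os


def chown_tree(root: str, uid: int, gid: int, chown=os.chown) -> int:
    """Set owner and group on `root` and everything beneath it.

    Returns the number of paths touched. Pass a different `chown` callable
    to preview the walk without changing anything.
    """
    chown(root, uid, gid)
    touched = 1
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            chown(os.path.join(dirpath, name), uid, gid)
            touched += 1
    return touched
```

Called as `chown_tree("models", 1000, 1000)` and again for `sparkyui-data`, this would match the 1000:1000 defaults described in the commit.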
TBNilles 359043ad67 feat: add StabilityMatrix-style Model Manager service
New FastAPI container (port 8189) to download and manage models:
- Installed Models, Add/Download (CivitAI/HuggingFace/direct URL), Settings views
- Persistent SQLite storage for API keys and download history (./sparkyui-data)
- Downloads land in ./models, auto-sorted into ComfyUI's standard subfolders
- Default COMFYUI_HOST_PATH and SPARKYUI_DATA_PATH to the project root
- Wire docker-compose service, env defaults, gitignore, README docs

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-07 06:19:44 -04:00
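
The auto-sorting of downloads in commit 359043ad67 can be pictured as a lookup from model type to subfolder. The table and the `destination_for` helper below are illustrative assumptions: the type keys and the fallback folder are made up for the example, and the service's real mapping is not shown in the commit message.

```python
from pathlib import Path

# Hypothetical mapping from a lower-cased model type to a ComfyUI models/ subfolder.
SUBFOLDER_BY_TYPE = {
    "checkpoint": "checkpoints",
    "lora": "loras",
    "vae": "vae",
    "controlnet": "controlnet",
    "textualinversion": "embeddings",
    "upscaler": "upscale_models",
}


def destination_for(models_root: Path, model_type: str, filename: str) -> Path:
    """Pick the target path for a downloaded file.

    Unknown types fall back to "checkpoints" (an assumption for this sketch).
    Only the final component of `filename` is kept, so a name containing
    directory parts cannot place the file outside the chosen subfolder.
    """
    subfolder = SUBFOLDER_BY_TYPE.get(model_type.strip().lower(), "checkpoints")
    return models_root / subfolder / Path(filename).name
```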
Evan Carmen 6fa6c5041b feat: NORMAL_VRAM + AIMDO + copy=False patch + kernel caching
Major unified memory optimization changes:

1. model_management.py: HIGH_VRAM → NORMAL_VRAM
   - GB10 unified memory: offloading to CPU doesn't save physical RAM
     (same pool), but NORMAL_VRAM allows per-layer partial loading when
     memory is tight instead of all-or-nothing OOM
   - text_encoder_offload_device() and vae_offload_device() now return
     CPU (allows ComfyUI to offload unused models)
   - intermediate_device() still returns GPU (VAE outputs must stay in
     CUDA allocator for honest memory tracking)
   - User can force HIGH_VRAM with --highvram if models fit

2. utils.py: copy=True → copy=False for tensor.to(device)
   - On GB10 unified memory, copy=True creates a full duplicate in both
     CPU and CUDA allocators simultaneously (ComfyUI issue #10896)
   - copy=False makes .to(device) a zero-copy device label change since
     both allocators draw from the same physical LPDDR5X
   - Halves model loading memory usage when --disable-mmap is set

3. Removed --disable-dynamic-vram from ComfyUI flags
   - Was preventing AIMDO (comfy_aimdo) from initializing
   - AIMDO now activates: VBAR-based page-level VRAM management at 32MB
     granularity instead of blunt .to(cpu) copies
   - Falls back to NORMAL_VRAM per-layer loading if AIMDO has issues

4. Added CUDA_CACHE_MAXSIZE=4294967296 (4GB kernel cache)
   - PTX→SASS kernel caching for sm_121 (GB10 Blackwell)
   - 3x speedup on subsequent runs reported by DGX Spark community

5. System: vm.swappiness reduced from 60 to 1
   - Swap thrashing on unified memory causes silent system freezes
   - Near-zero swappiness ensures clean OOM kills instead
2026-05-21 19:04:25 -05:00
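
As a quick check of the value in item 4 of commit 6fa6c5041b, 4294967296 is exactly 4 GiB expressed in bytes. The helper `cuda_cache_env` below is hypothetical and only shows the arithmetic; the repository sets the variable in its own configuration.

```python
def cuda_cache_env(gib: int = 4) -> dict:
    """Return the CUDA_CACHE_MAXSIZE setting for a cache of `gib` GiB, in bytes."""
    return {"CUDA_CACHE_MAXSIZE": str(gib * 1024 ** 3)}
```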
Evan Carmen 7e4d22e41c feat: Grace-Blackwell unified memory optimization for ComfyUI
- Add model_management.py patch: detects GB10 unified memory (VRAM ≈ RAM, ratio > 0.95)
- Set HIGH_VRAM mode: no pointless CPU offloading (same physical memory pool)
- Increase maximum_vram_for_weights from 88% to 95% (8.4GB headroom on 128GB)
- Skip torch.cuda.empty_cache() on unified memory (avoids page faults)
- Return GPU for text_encoder/vae/intermediate offload devices on unified memory
- MPS excluded from unified detection (has its own SHARED state)
- Remove PYTORCH_NO_CUDA_MEMORY_CACHING env var (patch handles caching properly)
- Mount patched file as read-only volume override in docker-compose.yml
- DeepSeek review: safe and correct for DGX Spark target

Co-authored-by: DeepSeek (code review)
2026-05-20 16:01:51 -05:00
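
The unified-memory detection in commit 7e4d22e41c is described only as "VRAM ≈ RAM > 0.95". One plausible reading, sketched below, is that the smaller of the two totals must be more than 95% of the larger. The function name `looks_unified` and the symmetric min/max comparison are assumptions; the patched model_management.py may compute the ratio differently.

```python
def looks_unified(total_vram: int, total_ram: int, threshold: float = 0.95) -> bool:
    """Guess whether GPU and CPU report the same physical memory pool.

    Both totals are in bytes. The ratio is taken as smaller/larger so that a
    discrete GPU with more VRAM than system RAM is not misclassified.
    """
    if total_vram <= 0 or total_ram <= 0:
        return False
    ratio = min(total_vram, total_ram) / max(total_vram, total_ram)
    return ratio > threshold
```

With PyTorch, the inputs could come from `torch.cuda.mem_get_info()` for VRAM and a system memory query for RAM, with the MPS backend excluded beforehand as the commit notes.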
Evan Carmen 15fc70663f Add ComfyUIMini mobile-friendly UI integration
New features:
- ComfyUIMini container (Node.js Alpine, ~150MB) for mobile/tablet access
- Separate container architecture with shared Docker network
- Health checks on both services with proper dependency ordering
- Shared output volume for image gallery feature

Files added:
- comfyuimini/Dockerfile - Node.js 20 Alpine with tsx runtime
- comfyuimini/.dockerignore - Build context filtering

Files updated:
- docker-compose.yml - Added comfyuimini service, network, health checks
- .env.example - Added COMFYUIMINI_PORT and COMFYUIMINI_REF
- README.md - Architecture diagram, ComfyUIMini docs, updated credits

Access points:
- ComfyUI (Desktop): http://<host>:8188
- ComfyUIMini (Mobile): http://<host>:3000

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-03 23:45:13 -06:00
Evan Carmen 687ce72dd3 Add Grace-Blackwell unified memory optimizations
Key changes based on HurbaLurba's DGX Spark research:

- Remove --gpu-only flag (fights unified memory fabric)
- Add --disable-pinned-memory, --force-fp16, --dont-upcast-attention
- Add CUDA env vars for unified memory: CUDA_MANAGED_FORCE_DEVICE_ALLOC,
  PYTORCH_NO_CUDA_MEMORY_CACHING, OMP_NUM_THREADS=20
- Document unified memory architecture best practices
- Add host-level GPU optimization instructions (clock locking, vboost)
- Document SageAttention PR #297 status (merged then reverted)
- Add credits section

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-03 21:01:25 -06:00
Evan Carmen 1f5aeb5248 Initial commit: SparkyUI - ComfyUI for DGX Spark (Blackwell GB10)
Docker-based ComfyUI setup for NVIDIA DGX Spark ARM64 + sm_121:
- CUDA 13.0.2 base (required for compute_121 support)
- PyTorch 2.9.1+cu130 ARM64 wheels
- SageAttention compiled with TORCH_CUDA_ARCH_LIST="12.1"
- Triton/torch.compile disabled (no sm_121 support yet)
- ComfyUI-Manager auto-installed at runtime
- Configurable model/data paths via .env

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-03 20:28:30 -06:00