Commit Graph

6 Commits

Author SHA1 Message Date
Evan Carmen 31939a9710 fix: revert intermediate_device to cpu for unified memory
intermediate_device() controls where large output tensors (decoded video
frames) are accumulated. On unified memory, cpu and cuda:0 share the same
physical RAM, but the CUDA allocator has different fragmentation behavior.

With intermediate_device=cuda:0, LTX video VAE decode hung because
tiled_scale_multidim allocates the full output tensor on cuda:0 upfront,
and the CUDA allocator can't efficiently reclaim space during tiled
decode. Reverting to cpu fixes the hang.

vae_offload_device() and text_encoder_offload_device() remain cuda:0
since those model-loading paths benefit from GPU allocation.
2026-05-20 19:30:53 -05:00
Evan Carmen 7e4d22e41c feat: Grace-Blackwell unified memory optimization for ComfyUI
- Add model_management.py patch: detects GB10 unified memory (VRAM ≈ RAM > 0.95)
- Set HIGH_VRAM mode: no pointless CPU offloading (same physical memory pool)
- Increase maximum_vram_for_weights from 88% to 95% (8.4GB headroom on 128GB)
- Skip torch.cuda.empty_cache() on unified memory (avoids page faults)
- Return GPU for text_encoder/vae/intermediate offload devices on unified memory
- MPS excluded from unified detection (has its own SHARED state)
- Remove PYTORCH_NO_CUDA_MEMORY_CACHING env var (patch handles caching properly)
- Mount patched file as read-only volume override in docker-compose.yml
- DeepSeek review: safe and correct for DGX Spark target

Co-authored-by: DeepSeek (code review)
2026-05-20 16:01:51 -05:00
Evan Carmen 15fc70663f Add ComfyUIMini mobile-friendly UI integration
New features:
- ComfyUIMini container (Node.js Alpine, ~150MB) for mobile/tablet access
- Separate container architecture with shared Docker network
- Health checks on both services with proper dependency ordering
- Shared output volume for image gallery feature

Files added:
- comfyuimini/Dockerfile - Node.js 20 Alpine with tsx runtime
- comfyuimini/.dockerignore - Build context filtering

Files updated:
- docker-compose.yml - Added comfyuimini service, network, health checks
- .env.example - Added COMFYUIMINI_PORT and COMFYUIMINI_REF
- README.md - Architecture diagram, ComfyUIMini docs, updated credits

Access points:
- ComfyUI (Desktop): http://<host>:8188
- ComfyUIMini (Mobile): http://<host>:3000

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-03 23:45:13 -06:00
Evan Carmen 434a90741c Add What's Included section, fix repo URL
- Add "What's Included" section listing ComfyUI, ComfyUI-Manager,
  SageAttention, and PyTorch versions
- Update clone URL to actual GitHub repo (ecarmen16/SparkyUI)
- ComfyUI-Manager is auto-installed on first container run

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-03 21:27:42 -06:00
Evan Carmen 687ce72dd3 Add Grace-Blackwell unified memory optimizations
Key changes based on HurbaLurba's DGX Spark research:

- Remove --gpu-only flag (fights unified memory fabric)
- Add --disable-pinned-memory, --force-fp16, --dont-upcast-attention
- Add CUDA env vars for unified memory: CUDA_MANAGED_FORCE_DEVICE_ALLOC,
  PYTORCH_NO_CUDA_MEMORY_CACHING, OMP_NUM_THREADS=20
- Document unified memory architecture best practices
- Add host-level GPU optimization instructions (clock locking, vboost)
- Document SageAttention PR #297 status (merged then reverted)
- Add credits section

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-03 21:01:25 -06:00
Evan Carmen 1f5aeb5248 Initial commit: SparkyUI - ComfyUI for DGX Spark (Blackwell GB10)
Docker-based ComfyUI setup for NVIDIA DGX Spark ARM64 + sm_121:
- CUDA 13.0.2 base (required for compute_121 support)
- PyTorch 2.9.1+cu130 ARM64 wheels
- SageAttention compiled with TORCH_CUDA_ARCH_LIST="12.1"
- Triton/torch.compile disabled (no sm_121 support yet)
- ComfyUI-Manager auto-installed at runtime
- Configurable model/data paths via .env

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-03 20:28:30 -06:00