fix: intermediate_device() returns cuda on unified memory

On Grace-Blackwell (GB10), CPU and GPU share the same physical RAM. intermediate_device() was returning 'cpu', which means ComfyUI allocates output buffers (like VAE decode) through the CPU allocator on the same physical memory pool it thinks is free VRAM.

This causes:

1. Memory accounting mismatch: ComfyUI thinks intermediates are 'over there' on CPU and overestimates available VRAM
2. Unnecessary .to(device) copies through separate allocator heaps
3. Heap fragmentation across the unified memory pool

Now matches text_encoder_offload_device() and vae_offload_device(), which already return get_torch_device() on UNIFIED_MEMORY.
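
The selection rule described above can be sketched in isolation. This is only an illustration, not ComfyUI's code: pick_intermediate_device is a hypothetical helper, plain strings stand in for torch.device objects, and the gpu_only / unified_memory parameters are assumed to mirror args.gpu_only and UNIFIED_MEMORY.

```python
def pick_intermediate_device(gpu_only: bool, unified_memory: bool,
                             gpu_device: str = "cuda") -> str:
    """Hypothetical stand-in for intermediate_device() after this change.

    Strings replace torch.device objects so the sketch runs without PyTorch.
    """
    # With unified memory, "cpu" and the GPU draw from the same physical pool,
    # so intermediates stay on the GPU device to keep accounting in one place.
    if gpu_only or unified_memory:
        return gpu_device
    # On discrete-memory systems, intermediates still go to host memory.
    return "cpu"


if __name__ == "__main__":
    for gpu_only in (False, True):
        for unified in (False, True):
            device = pick_intermediate_device(gpu_only, unified)
            print(f"gpu_only={gpu_only} unified_memory={unified} -> {device}")
```

Only the unified-memory case without gpu_only changes: it previously resolved to the CPU device and now resolves to the GPU device.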
2026-05-21 11:02:06 -05:00
parent 31939a9710
commit c803ea6146
1 changed file with 1 addition and 1 deletion
@@ -1106,7 +1106,7 @@ def text_encoder_dtype(device=None):


 def intermediate_device():
-    if args.gpu_only:
+    if args.gpu_only or UNIFIED_MEMORY:
        return get_torch_device()
    else:
        return torch.device("cpu")