fix: intermediate_device() returns cuda on unified memory
On Grace-Blackwell (GB10), CPU and GPU share the same physical RAM. intermediate_device() was returning 'cpu', which means ComfyUI allocates output buffers (like VAE decode) through the CPU allocator on the same physical memory pool it thinks is free VRAM.

This causes:

1. Memory accounting mismatch: ComfyUI thinks intermediates are 'over there' on CPU and overestimates available VRAM
2. Unnecessary .to(device) copies through separate allocator heaps
3. Heap fragmentation across the unified memory pool

Now matches text_encoder_offload_device() and vae_offload_device(), which already return get_torch_device() on UNIFIED_MEMORY.
@@ -1106,7 +1106,7 @@ def text_encoder_dtype(device=None):
 
 
 def intermediate_device():
-    if args.gpu_only:
+    if args.gpu_only or UNIFIED_MEMORY:
         return get_torch_device()
     else:
         return torch.device("cpu")
 
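The decision this hunk changes can be sketched without torch as a small standalone function. This is only an illustration of the condition, not the project's code: Config, pick_intermediate_device, and the "cuda" default are hypothetical names standing in for args.gpu_only, the UNIFIED_MEMORY flag, and get_torch_device().

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """Hypothetical stand-in for args.gpu_only and the UNIFIED_MEMORY flag."""
    gpu_only: bool = False
    unified_memory: bool = False


def pick_intermediate_device(cfg: Config, compute_device: str = "cuda") -> str:
    """Return the name of the device that would hold intermediate buffers."""
    # Same condition as the patched line: either flag keeps intermediates
    # on the compute device instead of routing them through the CPU allocator.
    if cfg.gpu_only or cfg.unified_memory:
        return compute_device
    return "cpu"


if __name__ == "__main__":
    for cfg in (Config(), Config(gpu_only=True), Config(unified_memory=True)):
        print(cfg, "->", pick_intermediate_device(cfg))
```

With both flags off, the sketch falls back to "cpu", which is the pre-existing behaviour on discrete-GPU systems. Setting unified_memory alone is now enough to keep intermediates on the compute device.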