ComfyUI doesn't allocate the memory in PyTorch #6475

Open
OddTama opened this issue Jan 15, 2025 · 1 comment
Labels
Potential Bug: User is reporting a bug. This should be tested.

Comments


OddTama commented Jan 15, 2025

Expected Behavior

The workflow executes without error.

Actual Behavior

KSampler
HIP out of memory. Tried to allocate 6.46 GiB. GPU 0 has a total capacity of 23.92 GiB of which 3.35 GiB is free. Of the allocated memory 16.97 GiB is allocated by PyTorch, and 127.91 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_HIP_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
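
Worth noting: the trace reports only 127.91 MiB reserved-but-unallocated, so fragmentation alone probably doesn't explain a failed 6.46 GiB request against 3.35 GiB free; still, the allocator's own suggestion costs nothing to try. A minimal sketch of applying it before launch (the env var comes straight from the error message; whether this ROCm nightly build honors expandable_segments is an assumption):

# hedged workaround, per the OOM message's own hint (untested here)
export PYTORCH_HIP_ALLOC_CONF=expandable_segments:True
python3 main.py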

Steps to Reproduce

Execute a normal workflow

Debug Logs

# ComfyUI Error Report
## Error Details
- **Node ID:** 3
- **Node Type:** KSampler
- **Exception Type:** torch.OutOfMemoryError
- **Exception Message:** HIP out of memory. Tried to allocate 6.46 GiB. GPU 0 has a total capacity of 23.92 GiB of which 3.35 GiB is free. Of the allocated memory 16.97 GiB is allocated by PyTorch, and 127.91 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_HIP_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
## Stack Trace

  File "/home/ad/ComfyUI/execution.py", line 327, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)

  File "/home/ad/ComfyUI/execution.py", line 202, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)

  File "/home/ad/ComfyUI/execution.py", line 174, in _map_node_over_list
    process_inputs(input_dict, i)

  File "/home/ad/ComfyUI/execution.py", line 163, in process_inputs
    results.append(getattr(obj, func)(**inputs))

  File "/home/ad/ComfyUI/nodes.py", line 1533, in sample
    return common_ksampler(model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=denoise)

  File "/home/ad/ComfyUI/nodes.py", line 1500, in common_ksampler
    samples = comfy.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,

  File "/home/ad/ComfyUI/comfy/sample.py", line 45, in sample
    samples = sampler.sample(noise, positive, negative, cfg=cfg, latent_image=latent_image, start_step=start_step, last_step=last_step, force_full_denoise=force_full_denoise, denoise_mask=noise_mask, sigmas=sigmas, callback=callback, disable_pbar=disable_pbar, seed=seed)

  File "/home/ad/ComfyUI/comfy/samplers.py", line 1110, in sample
    return sample(self.model, noise, positive, negative, cfg, self.device, sampler, sigmas, self.model_options, latent_image=latent_image, denoise_mask=denoise_mask, callback=callback, disable_pbar=disable_pbar, seed=seed)

  File "/home/ad/ComfyUI/comfy/samplers.py", line 1000, in sample
    return cfg_guider.sample(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)

  File "/home/ad/ComfyUI/comfy/samplers.py", line 985, in sample
    output = executor.execute(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)

  File "/home/ad/ComfyUI/comfy/patcher_extension.py", line 110, in execute
    return self.original(*args, **kwargs)

  File "/home/ad/ComfyUI/comfy/samplers.py", line 953, in outer_sample
    output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)

  File "/home/ad/ComfyUI/comfy/samplers.py", line 936, in inner_sample
    samples = executor.execute(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)

  File "/home/ad/ComfyUI/comfy/patcher_extension.py", line 110, in execute
    return self.original(*args, **kwargs)

  File "/home/ad/ComfyUI/comfy/samplers.py", line 715, in sample
    samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)

  File "/home/ad/ComfyUI/.venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)

  File "/home/ad/ComfyUI/comfy/k_diffusion/sampling.py", line 161, in sample_euler
    denoised = model(x, sigma_hat * s_in, **extra_args)

  File "/home/ad/ComfyUI/comfy/samplers.py", line 380, in __call__
    out = self.inner_model(x, sigma, model_options=model_options, seed=seed)

  File "/home/ad/ComfyUI/comfy/samplers.py", line 916, in __call__
    return self.predict_noise(*args, **kwargs)

  File "/home/ad/ComfyUI/comfy/samplers.py", line 919, in predict_noise
    return sampling_function(self.inner_model, x, timestep, self.conds.get("negative", None), self.conds.get("positive", None), self.cfg, model_options=model_options, seed=seed)

  File "/home/ad/ComfyUI/comfy/samplers.py", line 360, in sampling_function
    out = calc_cond_batch(model, conds, x, timestep, model_options)

  File "/home/ad/ComfyUI/comfy/samplers.py", line 196, in calc_cond_batch
    return executor.execute(model, conds, x_in, timestep, model_options)

  File "/home/ad/ComfyUI/comfy/patcher_extension.py", line 110, in execute
    return self.original(*args, **kwargs)

  File "/home/ad/ComfyUI/comfy/samplers.py", line 309, in _calc_cond_batch
    output = model.apply_model(input_x, timestep_, **c).chunk(batch_chunks)

  File "/home/ad/ComfyUI/comfy/model_base.py", line 131, in apply_model
    return comfy.patcher_extension.WrapperExecutor.new_class_executor(

  File "/home/ad/ComfyUI/comfy/patcher_extension.py", line 110, in execute
    return self.original(*args, **kwargs)

  File "/home/ad/ComfyUI/comfy/model_base.py", line 160, in _apply_model
    model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds).float()

  File "/home/ad/ComfyUI/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)

  File "/home/ad/ComfyUI/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)

  File "/home/ad/ComfyUI/comfy/ldm/modules/diffusionmodules/openaimodel.py", line 831, in forward
    return comfy.patcher_extension.WrapperExecutor.new_class_executor(

  File "/home/ad/ComfyUI/comfy/patcher_extension.py", line 110, in execute
    return self.original(*args, **kwargs)

  File "/home/ad/ComfyUI/comfy/ldm/modules/diffusionmodules/openaimodel.py", line 873, in _forward
    h = forward_timestep_embed(module, h, emb, context, transformer_options, time_context=time_context, num_video_frames=num_video_frames, image_only_indicator=image_only_indicator)

  File "/home/ad/ComfyUI/.venv/lib/python3.10/site-packages/torch/nn/functional.py", line 2380, in silu
    return torch._C._nn.silu(input)

System Information

  • ComfyUI Version: 0.3.10
  • Arguments: main.py
  • OS: posix
  • Python Version: 3.10.12 (main, Nov 6 2024, 20:22:13) [GCC 11.4.0]
  • Embedded Python: false
  • PyTorch Version: 2.7.0.dev20250115+rocm6.2.4

Devices

  • Name: cuda:0 AMD Radeon RX 7900 XTX : native
    • Type: cuda
    • VRAM Total: 25681895424
    • VRAM Free: 20495405056
    • Torch VRAM Total: 33554432
    • Torch VRAM Free: 0
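
For reference, the raw byte counts work out: 25681895424 / 2³⁰ ≈ 23.92 GiB, matching the total capacity in the OOM message, and the Torch VRAM Total of 33554432 bytes is 32 MiB reserved by PyTorch at report time, with none of it free.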


Other

No response
OddTama added the Potential Bug label on Jan 15, 2025

Frozen-byte commented Jan 16, 2025

I have this problem with the latest master, too. A previously working flux.dev workflow is unchanged; the error only occurs when I add the --gpu-only flag.

Startup:

# use correct GFX version
export HSA_OVERRIDE_GFX_VERSION=10.3.0
export PYTORCH_ROCM_ARCH=gfx1030

# from https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Optimizations#memory--performance-impact-of-optimizers-and-flags
export PYTORCH_HIP_ALLOC_CONF="garbage_collection_threshold:0.6,max_split_size_mb:128"
export PYTORCH_CUDA_ALLOC_CONF="garbage_collection_threshold:0.6,max_split_size_mb:128"

python3 main.py --use-pytorch-cross-attention --gpu-only

Output:

Total VRAM 16368 MB, total RAM 31221 MB
pytorch version: 2.7.0.dev20250112+rocm6.3
Set vram state to: HIGH_VRAM
Device: cuda:0 AMD Radeon RX 6900 XT : native
### ComfyUI Version: v0.3.10-70-g88ceb28 | Released on '2025-01-16'
.
.
.
VAE load device: cuda:0, offload device: cuda:0, dtype: torch.float32
model weight dtype torch.float8_e4m3fn, manual cast: torch.bfloat16
model_type FLUX
!!! Exception during processing !!! HIP out of memory. Tried to allocate 40.00 MiB. GPU 0 has a total capacity of 15.98 GiB of which 16.00 MiB is free. Of the allocated memory 15.76 GiB is allocated by PyTorch, and 3.43 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_HIP_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
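
Since --gpu-only pins everything to the card (note the log's VAE offload device: cuda:0), all model weights stay resident in the 16 GB of VRAM, which plausibly explains the OOM on a mere 40 MiB allocation. A hedged sketch of two things to try, assuming the flags behave as shown in this log (drop --gpu-only so ComfyUI can offload idle models to system RAM, and adopt the allocator's suggested setting):

# use correct GFX version (as in the startup script above)
export HSA_OVERRIDE_GFX_VERSION=10.3.0
export PYTORCH_ROCM_ARCH=gfx1030

# allocator hint taken straight from the OOM message (untested here)
export PYTORCH_HIP_ALLOC_CONF="expandable_segments:True"
export PYTORCH_CUDA_ALLOC_CONF="expandable_segments:True"

# without --gpu-only, ComfyUI can offload idle models to system RAM
python3 main.py --use-pytorch-cross-attention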
