ComfyUI doesn't allocate the memory in PyTorch #6475

Open
OddTama opened this issue Jan 15, 2025 · 1 comment
Labels
Potential Bug: User is reporting a bug. This should be tested.

Comments


OddTama commented Jan 15, 2025

Expected Behavior

The workflow executes without error.

Actual Behavior

KSampler
HIP out of memory. Tried to allocate 6.46 GiB. GPU 0 has a total capacity of 23.92 GiB of which 3.35 GiB is free. Of the allocated memory 16.97 GiB is allocated by PyTorch, and 127.91 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_HIP_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
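
Worth noting: the trace reports only 127.91 MiB reserved-but-unallocated, so fragmentation alone probably doesn't explain a failed 6.46 GiB request against 3.35 GiB free; still, the allocator's own suggestion costs nothing to try. A minimal sketch of applying it before launch (the env var comes straight from the error message; whether this ROCm nightly build honors expandable_segments is an assumption):

# hedged workaround, per the OOM message's own hint (untested here)
export PYTORCH_HIP_ALLOC_CONF=expandable_segments:True
python3 main.py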

Steps to Reproduce

Execute a normal workflow

Debug Logs

# ComfyUI Error Report
## Error Details
- **Node ID:** 3
- **Node Type:** KSampler
- **Exception Type:** torch.OutOfMemoryError
- **Exception Message:** HIP out of memory. Tried to allocate 6.46 GiB. GPU 0 has a total capacity of 23.92 GiB of which 3.35 GiB is free. Of the allocated memory 16.97 GiB is allocated by PyTorch, and 127.91 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_HIP_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
## Stack Trace

  File "/home/ad/ComfyUI/execution.py", line 327, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)

  File "/home/ad/ComfyUI/execution.py", line 202, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)

  File "/home/ad/ComfyUI/execution.py", line 174, in _map_node_over_list
    process_inputs(input_dict, i)

  File "/home/ad/ComfyUI/execution.py", line 163, in process_inputs
    results.append(getattr(obj, func)(**inputs))

  File "/home/ad/ComfyUI/nodes.py", line 1533, in sample
    return common_ksampler(model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=denoise)

  File "/home/ad/ComfyUI/nodes.py", line 1500, in common_ksampler
    samples = comfy.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,

  File "/home/ad/ComfyUI/comfy/sample.py", line 45, in sample
    samples = sampler.sample(noise, positive, negative, cfg=cfg, latent_image=latent_image, start_step=start_step, last_step=last_step, force_full_denoise=force_full_denoise, denoise_mask=noise_mask, sigmas=sigmas, callback=callback, disable_pbar=disable_pbar, seed=seed)

  File "/home/ad/ComfyUI/comfy/samplers.py", line 1110, in sample
    return sample(self.model, noise, positive, negative, cfg, self.device, sampler, sigmas, self.model_options, latent_image=latent_image, denoise_mask=denoise_mask, callback=callback, disable_pbar=disable_pbar, seed=seed)

  File "/home/ad/ComfyUI/comfy/samplers.py", line 1000, in sample
    return cfg_guider.sample(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)

  File "/home/ad/ComfyUI/comfy/samplers.py", line 985, in sample
    output = executor.execute(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)

  File "/home/ad/ComfyUI/comfy/patcher_extension.py", line 110, in execute
    return self.original(*args, **kwargs)

  File "/home/ad/ComfyUI/comfy/samplers.py", line 953, in outer_sample
    output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)

  File "/home/ad/ComfyUI/comfy/samplers.py", line 936, in inner_sample
    samples = executor.execute(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)

  File "/home/ad/ComfyUI/comfy/patcher_extension.py", line 110, in execute
    return self.original(*args, **kwargs)

  File "/home/ad/ComfyUI/comfy/samplers.py", line 715, in sample
    samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)

  File "/home/ad/ComfyUI/.venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)

  File "/home/ad/ComfyUI/comfy/k_diffusion/sampling.py", line 161, in sample_euler
    denoised = model(x, sigma_hat * s_in, **extra_args)

  File "/home/ad/ComfyUI/comfy/samplers.py", line 380, in __call__
    out = self.inner_model(x, sigma, model_options=model_options, seed=seed)

  File "/home/ad/ComfyUI/comfy/samplers.py", line 916, in __call__
    return self.predict_noise(*args, **kwargs)

  File "/home/ad/ComfyUI/comfy/samplers.py", line 919, in predict_noise
    return sampling_function(self.inner_model, x, timestep, self.conds.get("negative", None), self.conds.get("positive", None), self.cfg, model_options=model_options, seed=seed)

  File "/home/ad/ComfyUI/comfy/samplers.py", line 360, in sampling_function
    out = calc_cond_batch(model, conds, x, timestep, model_options)

  File "/home/ad/ComfyUI/comfy/samplers.py", line 196, in calc_cond_batch
    return executor.execute(model, conds, x_in, timestep, model_options)

  File "/home/ad/ComfyUI/comfy/patcher_extension.py", line 110, in execute
    return self.original(*args, **kwargs)

  File "/home/ad/ComfyUI/comfy/samplers.py", line 309, in _calc_cond_batch
    output = model.apply_model(input_x, timestep_, **c).chunk(batch_chunks)

  File "/home/ad/ComfyUI/comfy/model_base.py", line 131, in apply_model
    return comfy.patcher_extension.WrapperExecutor.new_class_executor(

  File "/home/ad/ComfyUI/comfy/patcher_extension.py", line 110, in execute
    return self.original(*args, **kwargs)

  File "/home/ad/ComfyUI/comfy/model_base.py", line 160, in _apply_model
    model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds).float()

  File "/home/ad/ComfyUI/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)

  File "/home/ad/ComfyUI/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)

  File "/home/ad/ComfyUI/comfy/ldm/modules/diffusionmodules/openaimodel.py", line 831, in forward
    return comfy.patcher_extension.WrapperExecutor.new_class_executor(

  File "/home/ad/ComfyUI/comfy/patcher_extension.py", line 110, in execute
    return self.original(*args, **kwargs)

  File "/home/ad/ComfyUI/comfy/ldm/modules/diffusionmodules/openaimodel.py", line 873, in _forward
    h = forward_timestep_embed(module, h, emb, context, transformer_options, time_context=time_context, num_video_frames=num_video_frames, image_only_indicator=image_only_indicator)

  File "/home/ad/ComfyUI/.venv/lib/python3.10/site-packages/torch/nn/functional.py", line 2380, in silu
    return torch._C._nn.silu(input)

System Information

  • ComfyUI Version: 0.3.10
  • Arguments: main.py
  • OS: posix
  • Python Version: 3.10.12 (main, Nov 6 2024, 20:22:13) [GCC 11.4.0]
  • Embedded Python: false
  • PyTorch Version: 2.7.0.dev20250115+rocm6.2.4

Devices

  • Name: cuda:0 AMD Radeon RX 7900 XTX : native
    • Type: cuda
    • VRAM Total: 25681895424
    • VRAM Free: 20495405056
    • Torch VRAM Total: 33554432
    • Torch VRAM Free: 0
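
For reference, the raw byte counts work out: 25681895424 / 2³⁰ ≈ 23.92 GiB, matching the total capacity in the OOM message, and the Torch VRAM Total of 33554432 bytes is 32 MiB reserved by PyTorch at report time, with none of it free.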


Other

No response
OddTama added the Potential Bug label on Jan 15, 2025

Frozen-byte commented Jan 16, 2025

I have this problem with the latest master, too. A previously working flux.dev workflow is unchanged; the error only occurs when I add the --gpu-only flag.

Startup:

# use correct GFX version
export HSA_OVERRIDE_GFX_VERSION=10.3.0
export PYTORCH_ROCM_ARCH=gfx1030

# from https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Optimizations#memory--performance-impact-of-optimizers-and-flags
export PYTORCH_HIP_ALLOC_CONF="garbage_collection_threshold:0.6,max_split_size_mb:128"
export PYTORCH_CUDA_ALLOC_CONF="garbage_collection_threshold:0.6,max_split_size_mb:128"

python3 main.py --use-pytorch-cross-attention --gpu-only

Output:

Total VRAM 16368 MB, total RAM 31221 MB
pytorch version: 2.7.0.dev20250112+rocm6.3
Set vram state to: HIGH_VRAM
Device: cuda:0 AMD Radeon RX 6900 XT : native
### ComfyUI Version: v0.3.10-70-g88ceb28 | Released on '2025-01-16'
.
.
.
VAE load device: cuda:0, offload device: cuda:0, dtype: torch.float32
model weight dtype torch.float8_e4m3fn, manual cast: torch.bfloat16
model_type FLUX
!!! Exception during processing !!! HIP out of memory. Tried to allocate 40.00 MiB. GPU 0 has a total capacity of 15.98 GiB of which 16.00 MiB is free. Of the allocated memory 15.76 GiB is allocated by PyTorch, and 3.43 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_HIP_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
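
Since --gpu-only pins everything to the card (note the log's VAE offload device: cuda:0), all model weights stay resident in the 16 GB of VRAM, which plausibly explains the OOM on a mere 40 MiB allocation. A hedged sketch of two things to try, assuming the flags behave as shown in this log (drop --gpu-only so ComfyUI can offload idle models to system RAM, and adopt the allocator's suggested setting):

# use correct GFX version (as in the startup script above)
export HSA_OVERRIDE_GFX_VERSION=10.3.0
export PYTORCH_ROCM_ARCH=gfx1030

# allocator hint taken straight from the OOM message (untested here)
export PYTORCH_HIP_ALLOC_CONF="expandable_segments:True"
export PYTORCH_CUDA_ALLOC_CONF="expandable_segments:True"

# without --gpu-only, ComfyUI can offload idle models to system RAM
python3 main.py --use-pytorch-cross-attention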
