RuntimeError: CUDA error: operation not supported when calling conv2d #58

Closed
annalasko opened this issue Jan 6, 2025 · 2 comments

@annalasko

I am trying to use ZLUDA to run the PyTorch implementation of Real-ESRGAN. When I run main.py:

import os
import torch
from PIL import Image
import numpy as np
from RealESRGAN import RealESRGAN

def main() -> int:
    cuda = torch.cuda.is_available()
    print(cuda)
    device = torch.device('cuda' if cuda else 'cpu')
    model = RealESRGAN(device, scale=4)
    model.load_weights('weights/RealESRGAN_x4.pth', download=True)
    for i, image in enumerate(os.listdir("inputs")):
        image = Image.open(f"inputs/{image}").convert('RGB')
        sr_image = model.predict(image)
        sr_image.save(f'results/{i}.png')

if __name__ == '__main__':
    torch.backends.cudnn.enabled = False
    torch.backends.cuda.enable_flash_sdp(False)
    torch.backends.cuda.enable_math_sdp(True)
    torch.backends.cuda.enable_mem_efficient_sdp(False)

    main()

from the following .bat file:

@ECHO off
SETLOCAL

CALL "%~dp0Scripts\activate.bat"

SET HIP_VISIBLE_DEVICES="0"
SET DISABLE_ADDMM_CUDA_LT=1
SET ZLUDA_COMGR_LOG_LEVEL=1

START /WAIT /B "%~dp0" zluda/zluda.exe python main.py

CALL "%~dp0Scripts\deactivate.bat"

ENDLOCAL
PAUSE /K

I am met with the following traceback:

Traceback (most recent call last):
  File "C:\devel\Python\_\Real-ESRGAN\main.py", line 24, in <module>
    main()
  File "C:\devel\Python\_\Real-ESRGAN\main.py", line 15, in main
    sr_image = model.predict(image)
               ^^^^^^^^^^^^^^^^^^^^
  File "C:\devel\Python\_\Real-ESRGAN\Lib\site-packages\torch\amp\autocast_mode.py", line 44, in decorate_autocast
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\devel\Python\_\Real-ESRGAN\RealESRGAN\model.py", line 72, in predict
    res = self.model(img[0:batch_size])
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\devel\Python\_\Real-ESRGAN\Lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\devel\Python\_\Real-ESRGAN\Lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\devel\Python\_\Real-ESRGAN\RealESRGAN\rrdbnet_arch.py", line 112, in forward
    feat = self.conv_first(feat)
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\devel\Python\_\Real-ESRGAN\Lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\devel\Python\_\Real-ESRGAN\Lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\devel\Python\_\Real-ESRGAN\Lib\site-packages\torch\nn\modules\conv.py", line 554, in forward
    return self._conv_forward(input, self.weight, self.bias)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\devel\Python\_\Real-ESRGAN\Lib\site-packages\torch\nn\modules\conv.py", line 549, in _conv_forward
    return F.conv2d(
           ^^^^^^^^^
RuntimeError: CUDA error: operation not supported
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

I have tried replacing nvrtc.dll, cublas.dll, and cusparse.dll in torch 2.5.1+cu124 with their ZLUDA counterparts.

I am running a Radeon RX 6950 XT.

I am running Python 3.12.7.

I have installed ROCm 6.2, and added the bin folder to my PATH.

Is this my doing, or is conv2d truly not supported in ZLUDA? If it is unsupported, is there an alternative I can implement to get the same effect?
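
One way to separate "my setup" from "conv2d under ZLUDA" is to run a single small convolution outside Real-ESRGAN. The following is a minimal sketch, not a definitive diagnostic: the function name `conv2d_smoke_test` and the tensor sizes are made up for illustration, and it assumes it is launched through the same .bat file in place of main.py.

```python
"""Minimal conv2d smoke test, independent of Real-ESRGAN (illustrative sketch)."""


def conv2d_smoke_test() -> str:
    """Run one small convolution and report the outcome as a string."""
    try:
        import torch
        import torch.nn.functional as F
    except ImportError:
        return "torch is not installed"

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    # Same setting as main.py, so cuDNN is disabled here as well.
    torch.backends.cudnn.enabled = False

    x = torch.randn(1, 3, 16, 16, device=device)  # one small 3-channel input
    w = torch.randn(8, 3, 3, 3, device=device)    # eight 3x3 filters (arbitrary sizes)
    try:
        y = F.conv2d(x, w, padding=1)
        if device.type == "cuda":
            # Wait for pending kernels so an asynchronous error surfaces here.
            torch.cuda.synchronize()
    except RuntimeError as exc:
        return f"conv2d failed on {device}: {exc}"
    return f"conv2d ok on {device}, output shape {tuple(y.shape)}"


if __name__ == "__main__":
    print(conv2d_smoke_test())
```

If this small case fails with the same "operation not supported" message, the problem is probably below Real-ESRGAN (the torch build or the replaced libraries) rather than in the model code.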

@lshqqytiger
Owner

lshqqytiger commented Jan 6, 2025

See #57. You'll need a cu118 build of torch. The backtrace is almost meaningless because the error originates in the ZLUDA libraries.
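
To confirm which build is actually installed in the virtual environment, the local tag after the `+` in the torch version string (for example `cu124` in `2.5.1+cu124`) can be inspected. A rough sketch follows; the helper names `cuda_build_tag` and `installed_torch_tag` are hypothetical, and it assumes the wheel carries its CUDA tag in the version string, as the one reported above does.

```python
from typing import Optional


def cuda_build_tag(torch_version: str) -> Optional[str]:
    """Return the local build tag (e.g. 'cu118') from a version string, or None."""
    _, sep, local = torch_version.partition("+")
    return local if sep and local else None


def installed_torch_tag() -> Optional[str]:
    """Return the build tag of the installed torch, or None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    return cuda_build_tag(torch.__version__)


if __name__ == "__main__":
    # Prints 'cu118' for a cu118 wheel, or None if torch is missing or untagged.
    print("installed torch build tag:", installed_torch_tag())
```

Running this inside the activated environment before and after reinstalling torch shows whether the switch to cu118 took effect.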

@annalasko
Author

This fixed it, thank you!
