CUBLAS_STATUS_NOT_SUPPORTED when calling cublasStrsm #73

Open
traugdor opened this issue Feb 17, 2025 · 6 comments

@traugdor

  File "D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI-Riffusion\nodes.py", line 58, in waveform_from_spectrogram
    Sxx_torch = mel_inv_scaler(Sxx_torch)
                ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Stable Diffusion\ComfyUI\venv\Lib\site-packages\torch\nn\modules\module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Stable Diffusion\ComfyUI\venv\Lib\site-packages\torch\nn\modules\module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Stable Diffusion\ComfyUI\venv\Lib\site-packages\torchaudio\transforms\_transforms.py", line 498, in forward
    specgram = torch.relu(torch.linalg.lstsq(self.fb.transpose(-1, -2)[None], melspec, driver=self.driver).solution)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when calling `cublasStrsm( handle, side, uplo, trans, diag, m, n, alpha, A, lda, B, ldb)`
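
For context, `cublasStrsm` is the single-precision cuBLAS routine that solves a triangular system with multiple right-hand sides, and the traceback suggests that `torch.linalg.lstsq` reaches it on the GPU path. The block below is a minimal pure-Python sketch of what such a solve computes for the upper-triangular, left-side, non-transposed case. The function name `solve_upper_triangular` is illustrative only and is not part of cuBLAS, PyTorch, or ZLUDA.

```python
def solve_upper_triangular(A, B, alpha=1.0):
    """Solve A @ X = alpha * B for X, where A is upper triangular.

    A is an n x n list of lists and B is an n x k list of lists.
    Illustrative only: this mirrors the math of a TRSM call, not its API.
    """
    n = len(A)
    k = len(B[0])
    X = [[0.0] * k for _ in range(n)]
    # Back substitution: start from the last row and work upward.
    for i in range(n - 1, -1, -1):
        if A[i][i] == 0:
            raise ZeroDivisionError("singular triangular matrix")
        for j in range(k):
            acc = alpha * B[i][j]
            # Subtract the contribution of the rows already solved.
            for p in range(i + 1, n):
                acc -= A[i][p] * X[p][j]
            X[i][j] = acc / A[i][i]
    return X
```

A backend that does not implement this routine cannot complete the least-squares solve on the GPU, which is consistent with the `CUBLAS_STATUS_NOT_SUPPORTED` status above.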

Here is the code that builds the call to InverseMelScale:

    def waveform_from_spectrogram(self, Sxx: np.ndarray, n_fft: int, hop_length: int, win_length: int, num_samples: int, 
        sample_rate: int, mel_scale: bool = True, nmels: int = 512, max_mel_iters: int = 200, num_griffin_lim_iters: int = 32,
        device: str = platform.system() == "Darwin" and "cpu" or "cuda"
    ) -> np.ndarray:
        Sxx_torch = torch.from_numpy(Sxx).to(device)
        
        if mel_scale:
            mel_inv_scaler = torchaudio.transforms.InverseMelScale(n_mels=nmels, sample_rate=sample_rate, f_min=0, f_max=10000,
                n_stft=n_fft // 2 + 1, norm=None, mel_scale="htk").to(device)
            Sxx_torch = mel_inv_scaler(Sxx_torch)

versions:

  • Windows: 10
  • torch==2.6.0+cu118
  • torchaudio==2.6.0+cu118

This is after doing the following:

  • upgraded PyTorch to 2.6.0 cu118
  • replaced the 3 official CUDA DLL files (cublas, cusparse, and nvrtc); comfyUI-Zluda already has the torch backends disabled, as recommended in the ZLUDA README
  • set $env:DISABLE_ADDM_CUDA_LT=1 in the PowerShell run script before invoking zluda.exe
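
As a possible stopgap while the GPU path is unsupported, the failing call could be retried on the CPU. The sketch below is a generic wrapper, assuming the backend error surfaces as a `RuntimeError` whose message contains `CUBLAS_STATUS_`, as in the traceback above. The name `run_with_cpu_fallback` and the `markers` parameter are hypothetical and not part of PyTorch, torchaudio, or ZLUDA.

```python
def run_with_cpu_fallback(gpu_call, cpu_call, markers=("CUBLAS_STATUS_",)):
    """Return gpu_call(); on a matching RuntimeError, return cpu_call().

    A RuntimeError whose message contains none of `markers` is re-raised,
    so unrelated bugs (shape mismatches and so on) are not hidden.
    """
    try:
        return gpu_call()
    except RuntimeError as exc:
        if not any(marker in str(exc) for marker in markers):
            raise
        return cpu_call()
```

In `waveform_from_spectrogram`, `gpu_call` might be `lambda: mel_inv_scaler(Sxx_torch)` and `cpu_call` might be `lambda: mel_inv_scaler.cpu()(Sxx_torch.cpu()).to(device)`. This usage has not been tested under ZLUDA, and it assumes the failed call leaves the device in a usable state.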
@traugdor
Author

Should I also replace

  • cudart
  • cufft
  • cufftw

I'm not sure whether adding these 3 DLLs to the official PyTorch libraries in my venv would fix the issue...

lshqqytiger self-assigned this Feb 18, 2025
lshqqytiger added the "implementation" (Unimplemented feature(s)) label Feb 18, 2025
@lshqqytiger
Owner

lshqqytiger commented Feb 18, 2025

Thank you for the report. Try this build:
cublas_dev.zip

> Should I also replace
>
>   • cudart
>   • cufft
>   • cufftw

cudart is not needed, but PyTorch sometimes uses the fft DLLs.

@traugdor
Author

I copied this into my zluda directory and into the torch directory and now I get this error:

rocBLAS error from hip error code: 'hipErrorInvalidDeviceFunction':98
!!! Exception during processing !!! CUDA error: CUBLAS_STATUS_INTERNAL_ERROR when calling `cublasStrsm( handle, side, uplo, trans, diag, m, n, alpha, A, lda, B, ldb)`
Traceback (most recent call last):
  File "D:\Stable Diffusion\ComfyUI\execution.py", line 327, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Stable Diffusion\ComfyUI\execution.py", line 202, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Stable Diffusion\ComfyUI\execution.py", line 174, in _map_node_over_list
    process_inputs(input_dict, i)
  File "D:\Stable Diffusion\ComfyUI\execution.py", line 163, in process_inputs
    results.append(getattr(obj, func)(**inputs))
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Stable Diffusion\ComfyUI\custom_nodes\riffusion\nodes.py", line 149, in Process_Riffusion
    audio, duration = self.get_wave_bytes_from_spectrogram(spec)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Stable Diffusion\ComfyUI\custom_nodes\riffusion\nodes.py", line 93, in get_wave_bytes_from_spectrogram
    samples = self.waveform_from_spectrogram(Sxx=Sxx, n_fft=n_fft, hop_length=hop_length, win_length=win_length,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Stable Diffusion\ComfyUI\custom_nodes\riffusion\nodes.py", line 58, in waveform_from_spectrogram
    Sxx_torch = mel_inv_scaler(Sxx_torch)
                ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Stable Diffusion\ComfyUI\venv\Lib\site-packages\torch\nn\modules\module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Stable Diffusion\ComfyUI\venv\Lib\site-packages\torch\nn\modules\module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Stable Diffusion\ComfyUI\venv\Lib\site-packages\torchaudio\transforms\_transforms.py", line 498, in forward
    specgram = torch.relu(torch.linalg.lstsq(self.fb.transpose(-1, -2)[None], melspec, driver=self.driver).solution)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: CUDA error: CUBLAS_STATUS_INTERNAL_ERROR when calling `cublasStrsm( handle, side, uplo, trans, diag, m, n, alpha, A, lda, B, ldb)`

For reference, I am running on an RX 6600XT.

@lshqqytiger
Owner

Could you try this?
dev.zip

@traugdor
Author

traugdor commented Feb 18, 2025

Hi, sorry for the late reply. There are two extra files in this build. Which files do I copy from it into torch?

I'm guessing I need at least:

  • cublas.dll
  • cublasLt.dll
  • cudnn.dll

However I would like to be sure before I accidentally break my torch install in my venv.

@lshqqytiger
Owner

lshqqytiger commented Feb 19, 2025

cublasLt and cudnn are excluded by default because they depend on ROCm components, which are not included in official Windows HIP SDK releases.

You'll need

  • cublas
  • cusparse
  • nvrtc
  • cufft, cufftw
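
Since the concern earlier in the thread was breaking the torch install in the venv, the replacement step could be scripted so that every overwritten file is backed up first. The following is a minimal sketch using only the Python standard library. The helper names `replace_with_backup` and `restore_backups`, the `.bak` suffix, and the `name_map` parameter are assumptions made for illustration; the actual source and destination file names must be supplied by the caller.

```python
import shutil
from pathlib import Path


def replace_with_backup(src_dir, dst_dir, name_map):
    """Copy each source file over its destination, backing up the original.

    name_map maps a source file name to the destination file name it should
    replace. No file names are assumed here; the caller provides them.
    Returns the list of backup paths that were created.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    # Check every source first so a missing file aborts before any change.
    for src_name in name_map:
        if not (src_dir / src_name).is_file():
            raise FileNotFoundError(src_dir / src_name)
    backups = []
    for src_name, dst_name in name_map.items():
        target = dst_dir / dst_name
        backup = target.with_name(target.name + ".bak")
        # Keep only the first backup so the pristine original is preserved.
        if target.exists() and not backup.exists():
            shutil.copy2(target, backup)
            backups.append(backup)
        shutil.copy2(src_dir / src_name, target)
    return backups


def restore_backups(backups):
    """Copy each backup back over the file it was made from."""
    for backup in backups:
        original = backup.with_name(backup.name[: -len(".bak")])
        shutil.copy2(backup, original)
```

The file names inside the torch library directory typically carry version suffixes, so it is worth listing that directory and building `name_map` from what is actually there rather than guessing.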
