How to update pytorch so it works with Ampere / 3XXX series GPU's? #35
Comments
Transferring to the transformers module repository.
Can confirm this issue. Using an RTX 3090, I could fix it inside the container by reinstalling torch.
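The exact command wasn't captured in the comment above; a typical reinstall along these lines (the CUDA version and wheel index here are assumptions chosen to match the CUDA 11.x driver reported later in this thread, not the commenter's exact command) would be:

```shell
# Hypothetical fix: replace the container's torch with a CUDA 11.x build
# that ships Ampere (sm_80/sm_86) kernel images.
pip install --upgrade torch --extra-index-url https://download.pytorch.org/whl/cu113
```

Running this inside the container only patches it until the image is recreated; rebuilding the image with the newer wheel makes the fix persistent.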
Is there any progress on this, or an easy way to fix it without forking the repo and pinning individual pytorch and CUDA versions? I am running into the same issue with my RTX card.
I'm also having this issue in 2024, except with an RTX 5000 series card. The NVIDIA drivers are set up correctly, and the hardware is seen and used by the Ollama container. I see the topic is still open; is there a recommended solution or workaround?

t2v-transformers | RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
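For the "Found no NVIDIA driver" symptom specifically, the container usually also needs the GPU passed through in the compose file. A sketch, assuming Docker Compose's GPU reservation syntax; the service name and the `ENABLE_CUDA` flag follow Weaviate's published compose examples but are not verified against the tutorial's exact file:

```yaml
services:
  t2v-transformers:
    environment:
      ENABLE_CUDA: '1'   # assumption: Weaviate's t2v images gate GPU use on this flag
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```

Without the `devices` reservation (or a legacy `runtime: nvidia` setting), the container never sees `/dev/nvidia*`, which produces exactly this RuntimeError even when `nvidia-smi` works on the host.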
Hello,
I just started working with Weaviate and was able to run docker compose successfully for the CPU version; however, I could not get the GPU version to run (link to introduction article).
When attempting to run it I get errors across the transformers, for example:
This is followed by blocks of
POST /vectors/ HTTP/1.1" 500 Internal Server Error
and at the very end:

gpu-newspublications-1 | {'error': [{'message': 'fail with status 500: CUDA error: no kernel image is available for execution on the device'}]}
I am able to successfully run other gpu docker containers such as
docker run -it --rm --gpus all ubuntu nvidia-smi
without issue. The important part of the output is:

NVIDIA-SMI 510.85.02 | Driver Version: 510.85.02 | CUDA Version: 11.6
This is running on a fresh install of Ubuntu 20.04, which ships with Python 3.8; this tells me the error lies with the PyTorch build inside the image. Everything runs on bare metal, with no virtualization other than the Docker containers from the article. From what I can tell, the problem is that the bundled PyTorch is compiled for compute capabilities only up to sm_70, which, based on this article, restricts it to older GPUs. In a cloud instance where you can select the GPU and use an older one like a P4, this makes sense; in a self-hosted environment it is more difficult.
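The failure mode can be sketched in a few lines (the arch lists below are illustrative assumptions, not read from any actual torch wheel): ignoring PTX forward-compatibility, a GPU can only execute kernel images compiled for its own compute capability, so a build that stops at sm_70 has nothing to offer an Ampere (sm_86) card.

```python
# Sketch of the "no kernel image is available" failure: a wheel ships binary
# kernels for a fixed list of compute capabilities, and the GPU needs a match
# (PTX JIT fallback is ignored here for simplicity).

def has_kernel_image(compiled_archs, gpu_arch):
    """True if the wheel contains a binary kernel image for this GPU."""
    return gpu_arch in compiled_archs

pre_ampere_wheel = ["sm_37", "sm_50", "sm_60", "sm_70"]  # illustrative build list
ampere_gpu = "sm_86"  # e.g. an RTX 3090

print(has_kernel_image(pre_ampere_wheel, ampere_gpu))                       # False
print(has_kernel_image(pre_ampere_wheel + ["sm_80", "sm_86"], ampere_gpu))  # True
```

This is why the same image works on older cloud GPUs but fails on a 3090: the hardware is newer than anything the wheel was compiled for.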
Have you encountered any issues with newer GPUs and the version of PyTorch built into the images in the tutorial? Is there a known workaround for this?
Thanks in advance!