Run GPULlama3.java on NVIDIA GPUs using Docker with either OpenCL or PTX support.
- NVIDIA GPU with compatible drivers
- NVIDIA Container Toolkit installed (👉 see NVIDIA's Container Toolkit install guide)
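Before pulling the images, you can confirm that Docker can actually reach the GPU through the NVIDIA Container Toolkit. This is a minimal sanity check; the CUDA base image tag below is only an example, so substitute any CUDA base image you have access to.

```bash
# The container should print the same GPU table as running nvidia-smi on the host.
# The image tag is an assumption; any CUDA base image works for this check.
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```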
OpenCL:

```bash
docker pull beehivelab/gpullama3.java-nvidia-openjdk-opencl:latest
```

PTX:

```bash
docker pull beehivelab/gpullama3.java-nvidia-openjdk-ptx:latest
```
Clone the repository that provides the runner scripts:

```bash
git clone https://github.com/beehive-lab/docker-gpullama3.java
cd docker-gpullama3.java
```
Download a GGUF model from Hugging Face, as shown in the GPULlama3.java repository. For example:

```bash
wget https://huggingface.co/beehive-lab/Llama-3.2-1B-Instruct-GGUF-FP16/resolve/main/beehive-llama-3.2-1b-instruct-fp16.gguf
```
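Optionally, sanity-check the download before running anything: GGUF files begin with the ASCII magic bytes `GGUF`, so a quick check might look like the following (file name taken from the wget command above).

```bash
# A valid GGUF model prints "GGUF" as its first four bytes.
head -c 4 beehive-llama-3.2-1b-instruct-fp16.gguf; echo
# Check that the file size looks plausible for an FP16 1B model (~2.5 GB).
ls -lh beehive-llama-3.2-1b-instruct-fp16.gguf
```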
OpenCL Runner:

```bash
./dockerized-llama-tornado-nvidia-opencl \
    --gpu --opencl --verbose-init \
    --model beehive-llama-3.2-1b-instruct-fp16.gguf \
    --prompt "tell me a joke"
```
PTX Runner:

```bash
./dockerized-llama-tornado-nvidia-ptx \
    --gpu --ptx --verbose-init \
    --model beehive-llama-3.2-1b-instruct-fp16.gguf \
    --prompt "tell me a joke"
```
Sample Output:

```text
Here's one: What do you call a fake noodle? An impasta!
```
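Since the runners are ordinary shell scripts, they can be scripted like any other command. A minimal sketch, assuming the OpenCL runner and model file from the steps above, that loops over a few prompts:

```bash
#!/usr/bin/env bash
# Hypothetical helper: invoke the OpenCL runner once per prompt.
# Assumes it is executed from the docker-gpullama3.java checkout
# and that the GGUF model file sits in the current directory.
set -euo pipefail

MODEL=beehive-llama-3.2-1b-instruct-fp16.gguf
PROMPTS=("tell me a joke" "summarize what TornadoVM does in one sentence")

for p in "${PROMPTS[@]}"; do
  echo "=== Prompt: $p ==="
  ./dockerized-llama-tornado-nvidia-opencl \
    --gpu --opencl \
    --model "$MODEL" \
    --prompt "$p"
done
```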
To build the Docker images locally, use the provided build.sh script:

```bash
./build.sh
```
Build for NVIDIA GPUs using one of the following flags:

- `--nvidia-jdk21-ocl` → Build image with OpenCL support and JDK 21
- `--nvidia-jdk21-ptx` → Build image with PTX support and JDK 21
Example:

```bash
./build.sh --nvidia-jdk21-ocl
```
This will create a Docker image ready to run GPULlama3.java
on NVIDIA GPUs with OpenCL.
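To confirm the build produced a local image, you can list matching repositories. The name pattern below assumes build.sh reuses the same image names as the published images pulled earlier; adjust it if your local tags differ.

```bash
# List locally available GPULlama3.java images (repository pattern is an assumption).
docker images --filter=reference='beehivelab/gpullama3.java-*'
```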
This project is developed at The University of Manchester and is open-source under the Apache 2.0 license.