@mattkjames7 (Contributor) commented Sep 23, 2025

Description

Added Dockerfile.cuda specifically for CUDA build and updated packaging workflows to use it.

A manual build seems to show CUDA as available, e.g.:

matt@matt-MS-7B86:/media/raid/Work/github$ docker run --rm --name mage-cuda -p 7687:7687 --gpus all -d memgraph/memgraph-mage:3.5.1-cuda --telemetry-enabled=False --log-level=TRACE
de3bd647563738c8040dcab120068c67cf78c324e9530ce438fd01fcc9f02236
matt@matt-MS-7B86:/media/raid/Work/github$ docker ps
CONTAINER ID   IMAGE                               COMMAND                  CREATED         STATUS         PORTS      NAMES
de3bd6475637   memgraph/memgraph-mage:3.5.1-cuda   "/usr/lib/memgraph/m…"   4 seconds ago   Up 3 seconds   7687/tcp   mage-cuda
0ab1328faca6   moby/buildkit:buildx-stable-1       "buildkitd --allow-i…"   3 months ago    Up 10 hours               buildx_buildkit_stupefied_jones0
matt@matt-MS-7B86:/media/raid/Work/github$ docker exec -it -u memgraph mage-cuda bash
memgraph@de3bd6475637:/$ python3
Python 3.12.3 (main, Aug 14 2025, 17:47:21) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
True
>>> torch.cuda.get_device_name(0)
'NVIDIA GeForce RTX 4070 Ti SUPER'
>>> 
  • CUDA: 12.6
  • PyTorch: 2.6.0
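
The CUDA and PyTorch versions listed above can be double-checked from inside the container with a one-liner; torch.version.cuda reports the CUDA version the installed torch build was compiled against (output omitted here):

# Print the installed torch version and the CUDA version it was built with.
python3 -c "import torch; print(torch.__version__, torch.version.cuda)"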

TODO:

  • Fine-tune the torch-* install to use prebuilt wheels for whichever torch version we end up building with; otherwise the build takes forever, as these packages are currently compiled from source (see the sketch after this list).
  • Select the torch version: MAGE currently uses 2.6.0, while the current stable release is 2.8.0. Recommendation: stay on 2.6.0 until dgl is removed.
  • Select which version(s) of CUDA we want to support. The current CUDA release is 13.0, but the latest supported by torch==2.8.0 is 12.9. Recommendation: use 12.6, as it is supported by every torch version from 2.6.0 to 2.8.0 and still covers older drivers (>=525, <580) and GPUs (Pascal, Maxwell).
  • ROCm
  • Test module to see if it can use the GPU
  • Test multiple GPUs
  • Revert changes to Dockerfile.release
  • Install memgraph-toolbox (https://pypi.org/project/memgraph-toolbox/).
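
A minimal sketch of what the prebuilt-wheel install in the first TODO item could look like, assuming we settle on torch 2.6.0 with CUDA 12.6. The wheel indexes and the torch-* package names below are assumptions for illustration and need to be matched against what MAGE actually installs:

# Assumed: install torch from the official CUDA 12.6 wheel index instead of building from source.
python3 -m pip install torch==2.6.0 --index-url https://download.pytorch.org/whl/cu126

# Assumed: install the torch-* companion packages as prebuilt wheels from the index matching the
# chosen torch/CUDA combination, if wheels are published for it (otherwise pip falls back to source builds).
python3 -m pip install torch-scatter torch-sparse -f https://data.pyg.org/whl/torch-2.6.0+cu126.html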

Pull request type

  • Bugfix
  • Algorithm/Module
  • Feature
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes, no api changes)
  • Build related changes
  • Documentation content changes
  • Other (please describe):

Related issues

Delete if this PR doesn't resolve any issues. Link the issue if it does.

######################################

Reviewer checklist (the reviewer checks this part)

Module/Algorithm

  • Core algorithm/module implementation
  • Query module implementation
  • Tests provided (unit / e2e)
  • Code documentation
  • README short description

Documentation checklist

  • Add the documentation label tag
  • Add the bug / feature label tag
  • Add the milestone for which this feature is intended
    • If not known, set for a later milestone
  • Write a release note, including added/changed clauses
    • [Release note text]
  • Link the documentation PR here
    • [Documentation PR link]

- name: Set target dockerfile
  run: |
    DOCKERFILE="Dockerfile.release"
    if [[ "${{ inputs.cuda }}" == true ]]; then

Check failure (Code scanning / SonarCloud): GitHub Actions should not be vulnerable to script injections (High)

Change this workflow to not use user-controlled data directly in a run block. See more on SonarQube Cloud
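
A common remediation for this class of finding is to pass the input through an environment variable instead of expanding ${{ inputs.cuda }} directly inside the run script, so the shell never interprets user-controlled text as code. A hedged sketch, in which the Dockerfile.cuda assignment and the GITHUB_ENV export are assumptions about what the rest of the step does:

- name: Set target dockerfile
  env:
    # Assumed: expose the workflow input as an environment variable rather than
    # interpolating it into the script body.
    CUDA_BUILD: ${{ inputs.cuda }}
  run: |
    DOCKERFILE="Dockerfile.release"
    if [[ "$CUDA_BUILD" == "true" ]]; then
      DOCKERFILE="Dockerfile.cuda"   # assumed to match the Dockerfile added in this PR
    fi
    echo "DOCKERFILE=$DOCKERFILE" >> "$GITHUB_ENV"   # assumed export; adjust to the real step
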
@gitbuda mentioned this pull request Sep 26, 2025
@mattkjames7 added this to the mage-v3.6.0 milestone Sep 26, 2025

Quality Gate failed

Failed conditions
15 Security Hotspots
Security Rating on New Code: E (required ≥ A)

See analysis details on SonarQube Cloud


@mattkjames7 requested a review from gitbuda September 26, 2025 12:10
@mattkjames7 marked this pull request as ready for review September 26, 2025 12:11
@gitbuda (Member) commented Oct 5, 2025

This is working (I've built the image and run it on GPUs). Merging into my branch; I think we should polish the API a bit.

@gitbuda merged commit fa79c8c into add-torch-gpu-docker-support Oct 5, 2025
16 of 19 checks passed
@gitbuda deleted the add-torch-gpu-docker-support-matt branch October 5, 2025 05:46