rocQuantum is a clean-room, open-source implementation of NVIDIA's cuQuantum SDK for AMD GPUs running the ROCm/HIP stack. It provides production-grade primitives for high-performance quantum-circuit simulation, tensor-network contraction, open-quantum-system dynamics, Pauli-string propagation, and stabilizer formalisms on Instinct-class accelerators (MI100 / MI200 / MI300).

rocQuantum is not affiliated with or endorsed by NVIDIA. The cu* -> roc* API surface is reimplemented from scratch from publicly documented behaviour. No proprietary headers or binaries are copied.

| rocQuantum library | cuQuantum analogue | Purpose |
|---|---|---|
| librocstatevec | libcustatevec | Dense state-vector simulation (single- and multi-GPU) |
| libroctensornet | libcutensornet | Tensor-network construction, pathfinding, contraction |
| librocdensitymat | libcudensitymat | Density-matrix / open-quantum-systems dynamics |
| librocpauliprop | libcupauliprop | Heisenberg-picture Pauli-string propagation |
| librocstabilizer | libcustabilizer | Stabilizer-formalism evolution and QEC decoders |

All five share a uniform handle/workspace lifecycle, async stream-based
execution (hipStream_t), versioned headers under include/, and
versioned shared-object sonames for ABI stability.
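
As a rough illustration of the naming scheme in the table above, the cuQuantum-to-rocQuantum library mapping can be written as a small lookup. This is a hypothetical porting helper, not something shipped with rocQuantum: the function name `to_roc_library` and the dictionary `CU_TO_ROC` are assumptions, and only the library names themselves come from the table.

```python
# Hypothetical porting helper; NOT part of rocQuantum.
# The library names are copied from the table above.
CU_TO_ROC = {
    "libcustatevec": "librocstatevec",
    "libcutensornet": "libroctensornet",
    "libcudensitymat": "librocdensitymat",
    "libcupauliprop": "librocpauliprop",
    "libcustabilizer": "librocstabilizer",
}


def to_roc_library(cu_name: str) -> str:
    """Return the rocQuantum analogue of a cuQuantum library name."""
    try:
        return CU_TO_ROC[cu_name]
    except KeyError:
        # Fail loudly rather than guess a name that is not in the table.
        raise ValueError(f"no rocQuantum analogue listed for {cu_name!r}") from None


if __name__ == "__main__":
    for cu in CU_TO_ROC:
        print(f"{cu} -> {to_roc_library(cu)}")
```

A script that rewrites link flags or package names during a port could call `to_roc_library` per entry and treat the `ValueError` as a signal that a dependency has no listed counterpart.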
rocQuantum builds on the existing ROCm math stack rather than reinventing dense linear algebra or collectives:

| Purpose | rocQuantum uses | cuQuantum analogue |
|---|---|---|
| Runtime / device API | hip | cuda_runtime |
| Dense BLAS | rocBLAS, hipBLAS | cuBLAS |
| Dense solvers / SVD | rocSOLVER, hipSOLVER | cuSOLVER |
| Sparse BLAS | rocSPARSE | cuSPARSE |
| Tensor contraction | hipTENSOR | cuTENSOR |
| FFT | rocFFT, hipFFT | cuFFT |
| Random numbers | rocRAND | cuRAND |
| Primitives / sort / scan | rocPRIM, hipCUB | CUB |
| High-level templates | rocThrust | Thrust |
| Multi-GPU collectives | RCCL | NCCL |
| Runtime JIT | hipRTC | NVRTC |

Supported GPU architectures:

- Datacenter: gfx908 (MI100), gfx90a (MI200/MI250), gfx940/gfx941/gfx942 (MI300A/X)
- Workstation (dev only): gfx1100/gfx1101 (RDNA3)
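
The gfx names in the list above are the values CMake's `CMAKE_HIP_ARCHITECTURES` variable takes, separated by semicolons. A minimal sketch of assembling that flag from product names follows. The helper `hip_arch_flag` and the `GFX_TARGETS` dictionary are hypothetical conveniences, not part of the rocQuantum build system, and the mapping only restates the list above.

```python
# Hypothetical helper; NOT part of the rocQuantum build system.
# The product-to-gfx mapping restates the architecture list above.
GFX_TARGETS = {
    "MI100": ["gfx908"],
    "MI200": ["gfx90a"],
    "MI250": ["gfx90a"],
    "MI300": ["gfx940", "gfx941", "gfx942"],
    "RDNA3": ["gfx1100", "gfx1101"],  # workstation, development only
}


def hip_arch_flag(products):
    """Build a -DCMAKE_HIP_ARCHITECTURES=... argument for the given products."""
    targets = []
    for product in products:
        for gfx in GFX_TARGETS[product]:
            if gfx not in targets:  # keep first-seen order, drop duplicates
                targets.append(gfx)
    return '-DCMAKE_HIP_ARCHITECTURES="' + ";".join(targets) + '"'


if __name__ == "__main__":
    print(hip_arch_flag(["MI200", "MI250"]))
```

Building only for the architectures actually present usually shortens compile time, since each extra gfx target adds another device-code pass.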

```sh
git clone https://github.com/ROCm/rocQuantum.git
cd rocQuantum
cmake -S . -B build \
  -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_HIP_ARCHITECTURES="gfx90a;gfx942" \
  -DROCQUANTUM_BUILD_TESTS=ON \
  -DROCQUANTUM_BUILD_SAMPLES=ON
cmake --build build -j
sudo cmake --install build
```

The Python package is built independently:

```sh
cd python
pip install --no-build-isolation -e .
```

A reference Docker image is provided:

```sh
docker build -f docker/Dockerfile.rocm6 -t rocquantum:rocm6 .
```

```
rocQuantum/
  CMakeLists.txt        # superbuild driver
  cmake/                # FindXxx finders + helpers
  libs/                 # the five native libraries
    rocstatevec/
    roctensornet/
    rocdensitymat/
    rocpauliprop/
    rocstabilizer/
  python/               # Cython bindings + high-level Python API
  samples/              # C/C++/HIP samples (one folder per library)
  benchmarks/           # roc-quantum-benchmarks Python package
  extra/                # MPI comm-plugin, wheel-build helpers
  docs/                 # Sphinx documentation site
  docker/               # ROCm 6 / ROCm 7 build containers
  ci/                   # self-hosted-runner job scripts
```

rocQuantum is released under the BSD 3-Clause license. See
LICENSE. Each library publishes its own LICENSE file in
its subdirectory; all are BSD-3-Clause unless explicitly noted.

This project is a clean-room reimplementation: no header, source, or binary from the proprietary NVIDIA cuQuantum SDK is incorporated. The API shape is reproduced from the public documentation at https://docs.nvidia.com/cuda/cuquantum/latest/ for interoperability purposes only.

If you use rocQuantum in academic work, please cite it using the
metadata in CITATION.cff. When publishing comparison
benchmarks, please also cite the upstream cuQuantum work by
H. Bayraktar et al., QCE 2023, doi:10.1109/QCE57702.2023.00119.