hip_torch_nl

PyTorch C++ extension for GPU-accelerated neighbor list computation on AMD hardware (HIP/ROCm). Unlike a standalone HIP library, this extension shares PyTorch's HIP runtime, allocator, and stream, so the neighbor list can be composed with other PyTorch GPU ops in the same process without context conflicts.

Requirements

ROCm 5.0+ (tested with 6.0 and 7.0)
PyTorch built with ROCm support
A C++17 toolchain and hipcc on PATH (provided by ROCm)

Installation

# Runtime override: make ROCm treat an officially unsupported GPU as gfx1100
# (example: gfx1102 / RX 7600). This is not a compile-time target setting.
export HSA_OVERRIDE_GFX_VERSION=11.0.0

pip install -e .

The build relies on PyTorch's CUDAExtension machinery, which routes through hipcc when ROCm is detected. setup.py looks for /opt/rocm, /opt/rocm-7.0.1, and /opt/rocm-6.0.0 and points CUDA_HOME at whichever exists.
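
The probing step described above can be sketched as follows. This is an illustration, not the repository's setup.py: the helper names find_rocm_home and configure_cuda_home are hypothetical, and the sketch assumes that the first existing directory wins and that any CUDA_HOME already set is overwritten. The real script may behave differently on both points.

```python
import os
from pathlib import Path

# Candidate install prefixes named in the text above.
ROCM_CANDIDATES = ("/opt/rocm", "/opt/rocm-7.0.1", "/opt/rocm-6.0.0")


def find_rocm_home(candidates=ROCM_CANDIDATES):
    """Return the first candidate that is an existing directory, else None."""
    for candidate in candidates:
        if Path(candidate).is_dir():
            return candidate
    return None


def configure_cuda_home(env=os.environ, candidates=ROCM_CANDIDATES):
    """Point CUDA_HOME at the detected ROCm prefix (assumed behavior).

    Leaves `env` untouched when no candidate exists and returns the
    detected prefix (or None) so the caller can report it.
    """
    rocm_home = find_rocm_home(candidates)
    if rocm_home is not None:
        env["CUDA_HOME"] = rocm_home
    return rocm_home
```

Calling configure_cuda_home() before the extension is declared is enough for this sketch; passing a plain dict as env makes the logic easy to exercise without touching the real environment.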

Usage

import torch
from hip_torch_nl import hip_torch_nl, HIP_TORCH_NL_AVAILABLE

assert HIP_TORCH_NL_AVAILABLE, "extension not built"

positions = torch.rand(1000, 3, device="cuda") * 10.0
cell = torch.eye(3, device="cuda") * 10.0
pbc = torch.tensor([True, True, True], device="cuda")
cutoff = torch.tensor(3.0)

mapping, shifts = hip_torch_nl(positions, cell, pbc, cutoff)
# mapping: (2, n_pairs) int64 — directed pairs (i, j) with i != j or shift != 0
# shifts:  (n_pairs, 3)         — integer cell shifts S such that
#                                 D = pos[j] - pos[i] + S @ cell
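
To make the shift convention concrete, here is a small pure-Python sketch that rebuilds one displacement vector from a pair and its shift, using the formula in the comment above. The displacement helper and the sample coordinates are made up for illustration, and nothing here calls the extension.

```python
import math


def displacement(pos_i, pos_j, shift, cell):
    """Return D = pos[j] - pos[i] + S @ cell for a single pair.

    `cell` is a 3x3 nested list whose rows are the cell vectors, so
    (S @ cell)[k] = sum over a of S[a] * cell[a][k].
    """
    offset = [sum(shift[a] * cell[a][k] for a in range(3)) for k in range(3)]
    return [pos_j[k] - pos_i[k] + offset[k] for k in range(3)]


# Illustrative values: two atoms on opposite sides of a 10 x 10 x 10 box.
cell = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]]
pos_i = [9.5, 5.0, 5.0]
pos_j = [0.5, 5.0, 5.0]
shift = [1, 0, 0]  # atom j is reached through its +x periodic image

d = displacement(pos_i, pos_j, shift, cell)
distance = math.sqrt(sum(c * c for c in d))
print(d, distance)  # [1.0, 0.0, 0.0] 1.0
```

With tensors, the same expression would be positions[mapping[1]] - positions[mapping[0]] + shifts.to(cell.dtype) @ cell, assuming shifts has the (n_pairs, 3) shape described above.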

API

hip_torch_nl(
    positions, cell, pbc, cutoff,
    sort_id=False,
    compatible_mode=True,
    algorithm="auto",
)

positions: (n_atoms, 3) float tensor on GPU.
cell: (3, 3) row-vector cell matrix on GPU (row i is the i-th cell vector).
pbc: (3,) bool tensor on GPU.
cutoff: scalar tensor or float.
sort_id: if True, sort the returned pairs by their first index.
compatible_mode: if True (default), filter results to exactly match torch_sim.neighbors.standard_nl. Set to False for the raw kernel output (and to remove the torch_sim dependency at call time).
algorithm: "auto" (default), "direct"/"v1", or "cell_list"/"v2". "auto" picks cell_list above 15 000 atoms.

The convenience wrappers hip_torch_nl_v1 and hip_torch_nl_v2 force the respective algorithm.
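
The selection rule for the algorithm argument can be sketched as below. This is a hedged illustration rather than the extension's dispatch code: resolve_algorithm is a hypothetical name, and treating "above 15 000" as a strict comparison is an assumption.

```python
AUTO_CELL_LIST_THRESHOLD = 15_000  # atom count quoted in the text above

# Accepted spellings mapped to one canonical name each.
_ALIASES = {
    "direct": "direct",
    "v1": "direct",
    "cell_list": "cell_list",
    "v2": "cell_list",
}


def resolve_algorithm(algorithm, n_atoms):
    """Map a user-facing algorithm name to "direct" or "cell_list"."""
    if algorithm == "auto":
        # Assumption: exactly 15 000 atoms still uses the direct kernel.
        return "cell_list" if n_atoms > AUTO_CELL_LIST_THRESHOLD else "direct"
    try:
        return _ALIASES[algorithm]
    except KeyError:
        raise ValueError(f"unknown algorithm: {algorithm!r}") from None
```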

Algorithms

| Variant | Complexity | Memory | Practical limit (8 GB VRAM) |
|---|---|---|---|
| V1 (`direct`) | O(n²) brute force, MIC | high; pairs buffer scales with n² | ~16k atoms |
| V2 (`cell_list`) | O(n) cell list, MIC | int32 pair indices, density-based estimation | ~37k atoms |

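Both algorithms apply the minimum image convention (MIC). They produce identical pair sets when cutoff is smaller than half the smallest cell height. At or above that limit, an atom can fall within the cutoff of more than one periodic image of the same neighbor, which MIC cannot represent, so neither implementation is appropriate.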
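
A caller can check the half-height condition before invoking either algorithm. The sketch below computes the perpendicular heights of a row-vector cell in pure Python. The helpers cell_heights and mic_is_valid are hypothetical and not part of the package, and the check treats all three directions as periodic.

```python
import math


def cross(u, v):
    """Cross product of two 3-vectors given as sequences."""
    return [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]


def cell_heights(cell):
    """Distance between opposite faces of the cell, one value per cell vector.

    `cell` holds the cell vectors as rows. The height along vector i is
    volume / |a_j x a_k|, where a_j and a_k are the other two vectors.
    """
    a, b, c = cell
    volume = abs(sum(x * y for x, y in zip(a, cross(b, c))))
    face_normals = (cross(b, c), cross(c, a), cross(a, b))
    return [volume / math.sqrt(sum(x * x for x in n)) for n in face_normals]


def mic_is_valid(cell, cutoff):
    """True when cutoff is strictly below half the smallest cell height."""
    return cutoff < 0.5 * min(cell_heights(cell))
```

For the 10 x 10 x 10 cubic cell used in the usage example, every height is 10, so a cutoff of 3.0 passes. A sheared cell can fail the same check even when all of its edge lengths exceed twice the cutoff, which is why the heights are compared rather than the vector norms.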

Repository layout

hip_torch_nl/
├── __init__.py                     # Python interface
├── csrc/
│   ├── hip_neighborlist.cpp        # pybind11 bindings + algorithm dispatch
│   └── hip_neighborlist_kernel.cu  # HIP kernels (V1 brute force, V2 cell list)
tests/                              # pytest correctness suite
benchmarks/run_benchmarks.py        # timing harness

Tests

pip install -e ".[test]"
pytest

The suite verifies output against a vectorized brute-force MIC reference implemented in tests/conftest.py. It covers:

random positions under full, partial, and zero PBC
V1 vs V2 agreement
FCC nearest-neighbor coordination number (12)
pair-list symmetry, sort_id, and dtype preservation
input validation (CPU tensors, bad shapes, missing extension)

Tests skip cleanly if the extension is not built or no HIP/CUDA device is available.
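
To show what a brute-force MIC reference and the FCC coordination check involve, here is a loop-based pure-Python sketch. It is a stand-in written for this README, not the vectorized reference in tests/conftest.py. It handles only cubic, fully periodic boxes, and it uses a strict less-than comparison against the cutoff, which is an assumption about the convention.

```python
import itertools
import math


def fcc_positions(a, reps):
    """FCC supercell made of reps^3 conventional cells with edge length a."""
    basis = [(0.0, 0.0, 0.0), (0.0, 0.5, 0.5), (0.5, 0.0, 0.5), (0.5, 0.5, 0.0)]
    return [
        ((i + bx) * a, (j + by) * a, (k + bz) * a)
        for i, j, k in itertools.product(range(reps), repeat=3)
        for bx, by, bz in basis
    ]


def brute_force_mic_pairs(positions, box, cutoff):
    """All directed pairs (i, j), i != j, closer than cutoff in a cubic periodic box."""
    pairs = []
    for i, pi in enumerate(positions):
        for j, pj in enumerate(positions):
            if i == j:
                continue
            # Minimum image: wrap each component into [-box/2, box/2].
            d = [pj[k] - pi[k] for k in range(3)]
            d = [x - box * round(x / box) for x in d]
            if math.sqrt(sum(x * x for x in d)) < cutoff:
                pairs.append((i, j))
    return pairs


a, reps = 1.0, 3
positions = fcc_positions(a, reps)
box = a * reps
cutoff = 0.8 * a  # between the first shell (a / sqrt(2) ≈ 0.707) and the second (a)

pairs = brute_force_mic_pairs(positions, box, cutoff)
coordination = [0] * len(positions)
for i, _ in pairs:
    coordination[i] += 1

# Every atom in a perfect FCC lattice has 12 nearest neighbors.
assert all(c == 12 for c in coordination)
```

The cutoff of 0.8 stays below half the box length of 1.5, so the minimum image convention is valid for this geometry.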

Benchmarks

python -m benchmarks.run_benchmarks --sizes 1000 4000 16000
python -m benchmarks.run_benchmarks --include-reference  # also time standard_nl

The script sweeps system sizes, runs V1, V2, and the auto selector, and reports median and best-of wall-clock time. Pass --include-reference to also time torch_sim.neighbors.standard_nl on CPU as a baseline. Use --cutoff and --density to control the test geometry; the cutoff must stay below half the cubic box height (the script enforces this).
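
The geometry constraint and the reported statistics can be sketched as follows. This is not the code in benchmarks/run_benchmarks.py. The helper names are hypothetical, and the sketch assumes that --density means number density (atoms per unit volume) in a cubic box.

```python
import statistics
import time


def cubic_box_length(n_atoms, density):
    """Edge length of a cubic box holding n_atoms at the given number density."""
    return (n_atoms / density) ** (1.0 / 3.0)


def check_cutoff(n_atoms, density, cutoff):
    """Return the box length, or raise if cutoff is not below half of it."""
    length = cubic_box_length(n_atoms, density)
    if cutoff >= 0.5 * length:
        raise ValueError(
            f"cutoff {cutoff} must be below half the box length {length:.3f}"
        )
    return length


def time_call(fn, repeats=5, sync=None):
    """Return (median, best) wall-clock seconds over `repeats` calls of fn().

    `sync` is an optional callable run before each clock read. For GPU work
    it would typically be torch.cuda.synchronize, because kernel launches
    are asynchronous and would otherwise be timed only partially.
    """
    samples = []
    for _ in range(repeats):
        if sync is not None:
            sync()
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples), min(samples)
```

For example, check_cutoff(1000, 1.0, 3.0) accepts the geometry because the box length is about 10, while a cutoff of 6.0 at the same size and density raises ValueError.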

License

BSD-3-Clause
