Releases: ROCm/TransferBench

rocm-7.2.0

21 Jan 20:44
a824bc1

ROCm release v7.2.0

TransferBench v1.66.01

19 Jan 16:02
9ec17d7

v1.66.01

Fixed

  • Adding support for TheRock
  • Fixing parsing issue when using NULL memory type
  • Fixing CUDA compilation flags when enabling NIC/MPI

Modified

  • TransferBenchCuda must now be explicitly built via 'make TransferBenchCuda'

TransferBench v1.66.00

05 Jan 22:32
bbd72a6

v1.66.00

  • Adding multi-node support, via sockets or MPI. (MPI support requires compiling against an MPI implementation like OpenMPI)
    • See CHANGELOG for more details on multi-node usage, including new rank notation
  • Wildcard support
  • CSV friendly tabular output
  • Additional memory types
  • "dryrun" preset
  • Adding nicrings preset - This runs parallel transfers where NICs form rings connecting identically numbered NICs across ranks
  • p2p and a2a presets have deprecated the use of USE_FINE_GRAIN in exchange for CPU_MEM_TYPE and GPU_MEM_TYPE to allow for more expansive testing
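
The ring pattern described for the nicrings preset can be pictured with a small sketch. This is an illustration only, not TransferBench code: the function name `nic_ring_transfers`, the (rank, NIC index) tuples, and the choice of sending to the next-higher rank are all assumptions made for the example.

```python
def nic_ring_transfers(num_ranks, nics_per_rank):
    """Enumerate (source, destination) pairs for a hypothetical NIC ring layout.

    Each endpoint is a (rank, nic_index) tuple. For every NIC index, the NIC on
    rank r sends to the same-numbered NIC on rank (r + 1) % num_ranks, so each
    NIC index forms its own ring across all ranks. The direction is an
    assumption for illustration; the preset may order transfers differently.
    """
    transfers = []
    for nic in range(nics_per_rank):
        for rank in range(num_ranks):
            next_rank = (rank + 1) % num_ranks  # wrap around to close the ring
            transfers.append(((rank, nic), (next_rank, nic)))
    return transfers


if __name__ == "__main__":
    # 3 ranks with 2 NICs each give 2 independent rings of 3 transfers.
    for src, dst in nic_ring_transfers(num_ranks=3, nics_per_rank=2):
        print(f"rank {src[0]} NIC {src[1]} -> rank {dst[0]} NIC {dst[1]}")
```

All transfers in the list are meant to run in parallel, which is why the sketch returns them as one flat list rather than grouping them by ring.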

rocm-7.1.1

26 Nov 06:33
a824bc1

ROCm release v7.1.1

TransferBench v1.65.00

13 Nov 05:56
3f8d00d

v1.65.00

Added

  • Added warp-level dispatch support via GFX_SE_TYPE environment variable
    • GFX_SE_TYPE=0 (default): Threadblock-level dispatch, each subexecutor is a threadblock
    • GFX_SE_TYPE=1: Warp-level dispatch, each subexecutor is a single warp
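
As a rough sketch of selecting the dispatch mode, the snippet below sets GFX_SE_TYPE in the environment before launching a run. The helper name `build_launch`, the binary path `./TransferBench`, and passing the `p2p` preset as the only argument are assumptions for illustration; adjust them to match the local build and the command line you actually use.

```python
import os
import shutil
import subprocess


def build_launch(se_type, binary="./TransferBench", preset="p2p"):
    """Return (argv, env) for a run with the given GFX_SE_TYPE value.

    binary and preset are illustrative defaults, not values taken from
    the release notes.
    """
    if se_type not in (0, 1):
        raise ValueError("GFX_SE_TYPE must be 0 (threadblock) or 1 (warp)")
    env = dict(os.environ)            # copy so the caller's environment is untouched
    env["GFX_SE_TYPE"] = str(se_type)  # environment values must be strings
    return [binary, preset], env


if __name__ == "__main__":
    argv, env = build_launch(1)       # 1 selects warp-level dispatch
    if shutil.which(argv[0]):
        subprocess.run(argv, env=env, check=True)
    else:
        print("binary not found; would run:",
              "GFX_SE_TYPE=" + env["GFX_SE_TYPE"], *argv)
```

Running the same command once with 0 and once with 1 is a simple way to compare the two dispatch modes on the same workload.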

rocm-7.1.0

30 Oct 05:21
a824bc1

ROCm release v7.1.0

rocm-7.0.2

10 Oct 12:08
6bcbcf4

ROCm release v7.0.2

rocm-6.4.4

24 Sep 14:00
0fbfbdd

ROCm release v6.4.4

rocm-7.0.1

17 Sep 16:40
6bcbcf4

ROCm release v7.0.1

rocm-7.0.0

16 Sep 06:36
6bcbcf4

ROCm release v7.0.0