Releases: ROCm/TransferBench
rocm-7.2.0
ROCm release v7.2.0
TransferBench v1.66.01
v1.66.01
Fixed
- Adding support for TheRock
- Fixing parsing issue when using NULL memory type
- Fixing CUDA compilation flags when enabling NIC/MPI
Modified
- TransferBenchCuda must now be built explicitly via 'make TransferBenchCuda'
TransferBench v1.66.00
v1.66.00
- Adding multi-node support via sockets or MPI (MPI support requires compiling against an MPI implementation such as OpenMPI)
  - See CHANGELOG for more details on multi-node usage, including the new rank notation
- Wildcard support
- CSV friendly tabular output
- Additional memory types
- "dryrun" preset
- Adding nicrings preset: runs parallel transfers in which NICs form rings connecting identically numbered NICs across ranks
- The p2p and a2a presets deprecate USE_FINE_GRAIN in favor of CPU_MEM_TYPE and GPU_MEM_TYPE, which allow more expansive testing
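
Scripts that still export USE_FINE_GRAIN for the p2p or a2a presets may need updating. The following is a minimal sketch of a pre-flight check, not part of TransferBench itself: the helper name `check_preset_env` is hypothetical, and it only reports the deprecated variable. It does not guess at value mappings, which are documented in the CHANGELOG.

```python
import os

# Variable names taken from the v1.66.00 release notes.
DEPRECATED = "USE_FINE_GRAIN"
REPLACEMENTS = ("CPU_MEM_TYPE", "GPU_MEM_TYPE")


def check_preset_env(env):
    """Return a list of warnings about deprecated preset variables in env.

    Hypothetical helper: it inspects a mapping of environment variables
    and does not launch or configure TransferBench.
    """
    warnings = []
    if DEPRECATED in env:
        warnings.append(
            f"{DEPRECATED} is deprecated for the p2p and a2a presets; "
            f"use {' and '.join(REPLACEMENTS)} instead"
        )
    return warnings


if __name__ == "__main__":
    # Check the current process environment before launching a preset.
    for message in check_preset_env(os.environ):
        print("warning:", message)
```

Running the check before a benchmark launch flags stale settings early instead of leaving them in place unnoticed.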
rocm-7.1.1
ROCm release v7.1.1
TransferBench v1.65.00
v1.65.00
Added
- Added warp-level dispatch support via GFX_SE_TYPE environment variable
  - GFX_SE_TYPE=0 (default): Threadblock-level dispatch, each subexecutor is a threadblock
  - GFX_SE_TYPE=1: Warp-level dispatch, each subexecutor is a single warp
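
As an illustration of selecting the dispatch mode from a launcher script, the sketch below builds an environment with GFX_SE_TYPE set to one of the two documented values. The helper name `build_env` is hypothetical, and the binary path `./TransferBench` and the `p2p` preset in the commented launch line are assumptions about a local build, not something these notes specify.

```python
import os

# The two values documented in the v1.65.00 notes.
GFX_SE_TYPES = {
    0: "threadblock-level dispatch (default)",
    1: "warp-level dispatch",
}


def build_env(se_type, base_env=None):
    """Return a copy of the environment with GFX_SE_TYPE set.

    Hypothetical helper: raises ValueError for values other than 0 or 1.
    """
    if se_type not in GFX_SE_TYPES:
        raise ValueError(
            f"GFX_SE_TYPE must be one of {sorted(GFX_SE_TYPES)}, got {se_type!r}"
        )
    env = dict(os.environ if base_env is None else base_env)
    env["GFX_SE_TYPE"] = str(se_type)
    return env


if __name__ == "__main__":
    env = build_env(1)
    print("GFX_SE_TYPE =", env["GFX_SE_TYPE"], "->", GFX_SE_TYPES[1])
    # To launch with this setting (path and preset are assumptions):
    # import subprocess
    # subprocess.run(["./TransferBench", "p2p"], env=env, check=True)
```

Copying the environment rather than mutating `os.environ` keeps the setting scoped to the launched process, which makes it easier to compare both dispatch modes from one script.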
rocm-7.1.0
ROCm release v7.1.0
rocm-7.0.2
ROCm release v7.0.2
rocm-6.4.4
ROCm release v6.4.4
rocm-7.0.1
ROCm release v7.0.1
rocm-7.0.0
ROCm release v7.0.0