Problem Description
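I installed rPD (rocmProfileData) and ran the tracing example by following the README.md, but the run aborted with "double free or corruption (!prev)" right after the tracer finalized. Every rocpd_* line in the output reports 0.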
root@tw024:/ws/Try_rPD# runTracer.sh python matmult_gpu.py
Creating empty rpd: trace.rpd
rpd_tracer, because
Shape of input data matrix: [1000, 500], weight matrix: [500, 500], result matrix:torch.Size([1000, 500])
tensor([[ 31.2559, -5.9614, -7.5495, ..., 4.9965, 13.3129, -22.1125],
[ -8.3562, -23.1422, -7.1189, ..., -30.3476, 8.9711, -43.7970],
[ 8.6492, 2.2358, -10.6567, ..., 21.0161, -46.0028, -26.3684],
...,
[-18.7425, -26.5550, -22.3633, ..., 21.0699, 33.3842, -24.6637],
[-37.9485, 16.3621, -19.1744, ..., -0.9327, 1.9820, -13.6000],
[ 11.3354, 22.0743, 20.7730, ..., -0.6945, -12.1807, -11.0098]],
device='cuda:0')
rocpd_op: 0
rocpd_api_ops: 0
rocpd_kernelapi: 0
rocpd_copyapi: 0
rocpd_api: 0
rocpd_string: 0
rpd_tracer: finalized in 6.323764 ms
double free or corruption (!prev)
/usr/local/bin/runTracer.sh: line 42: 20 Aborted LD_PRELOAD=librpd_tracer.so "$@"
root@tw024:/ws/Try_rPD# cat matmult_gpu.py
import argparse
import torch


def matmult_gpu(input_data, weights):
    """
    Perform matrix multiplication of two tensors on GPU.

    Args:
        input_data (torch.Tensor): Input tensor.
        weights (torch.Tensor): Weight tensor.

    Returns:
        torch.Tensor: Result of matrix multiplication.
    """
    # Move both tensors to the GPU
    input_data = input_data.to('cuda')
    weights = weights.to('cuda')
    # Matrix multiplication using torch.matmul
    output = torch.matmul(input_data, weights)
    return output


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Perform matrix multiplication of two tensors.')
    parser.add_argument('--x_shape', nargs=2, type=int, default=[1000, 500], metavar=('N', 'M'), help='Shape of input data matrix')
    parser.add_argument('--w_shape', nargs=2, type=int, default=[500, 500], metavar=('J', 'K'), help='Shape of weight matrix')
    args = parser.parse_args()
    # Tensors are created on the CPU; matmult_gpu moves them to the GPU
    input_data = torch.randn(*args.x_shape)
    weights = torch.randn(*args.w_shape)
    output = matmult_gpu(input_data, weights)
    print(f'Shape of input data matrix: {args.x_shape}, weight matrix: {args.w_shape}, result matrix:{output.shape}')
    print(output)
Operating System
Ubuntu 22.04 inside the Docker image rocm/vllm-dev:20241025-tuned
CPU
AMD EPYC 9654 96-Core Processor
GPU
AMD MI300X
ROCm Version
ROCm 6.2.0
ROCm Component
No response
Steps to Reproduce
- Start the container from the image rocm/vllm-dev:20241025-tuned
- Log in to the container
- Install rocmProfileData in the container.
- Run the trace example "Profiling a PyTorch multiplication function" from https://github.com/ROCm/rocmProfileData/blob/master/examples/rocm-profile-data/README.md
- The run aborts with the log "/usr/local/bin/runTracer.sh: line 42: 20 Aborted LD_PRELOAD=librpd_tracer.so "$@""
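To check whether anything was written to trace.rpd before the abort, a small script along these lines could be run after the last step. This is only a sketch under two assumptions: that trace.rpd is a SQLite database (as the rocmProfileData project describes its rpd files), and that the rocpd_* names printed in the log correspond to tables or views in that file. The helper name count_rows is hypothetical and not part of rocmProfileData.

```python
import sqlite3
from pathlib import Path

# Names copied from the tracer's log output. That they are tables or views
# inside trace.rpd is an assumption, not something the log confirms.
TABLES = [
    "rocpd_op",
    "rocpd_api_ops",
    "rocpd_kernelapi",
    "rocpd_copyapi",
    "rocpd_api",
    "rocpd_string",
]


def count_rows(path, tables=TABLES):
    """Return {name: row count}, with None for names missing from the file."""
    # Open read-only so a missing file raises an error instead of being created.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        existing = {
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type IN ('table', 'view')"
            )
        }
        counts = {}
        for table in tables:
            if table in existing:
                query = f'SELECT COUNT(*) FROM "{table}"'
                counts[table] = conn.execute(query).fetchone()[0]
            else:
                counts[table] = None
        return counts
    finally:
        conn.close()
```

Calling count_rows("trace.rpd") in the directory where runTracer.sh was started returns one entry per name. A value of None means the name does not exist in the file at all, which is different from a table that exists but is empty.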
(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support
No response
Additional Information
No response