[Issue]: runTracer.sh trace aborted (Failed) #68

@alexhegit

Problem Description

I installed rPD (rocmProfileData) and ran the tracing example by following the README.md, but the run was aborted (failed).

root@tw024:/ws/Try_rPD# runTracer.sh python matmult_gpu.py
Creating empty rpd: trace.rpd
rpd_tracer, because
Shape of input data matrix: [1000, 500], weight matrix: [500, 500], result matrix:torch.Size([1000, 500])
tensor([[ 31.2559, -5.9614, -7.5495, ..., 4.9965, 13.3129, -22.1125],
[ -8.3562, -23.1422, -7.1189, ..., -30.3476, 8.9711, -43.7970],
[ 8.6492, 2.2358, -10.6567, ..., 21.0161, -46.0028, -26.3684],
...,
[-18.7425, -26.5550, -22.3633, ..., 21.0699, 33.3842, -24.6637],
[-37.9485, 16.3621, -19.1744, ..., -0.9327, 1.9820, -13.6000],
[ 11.3354, 22.0743, 20.7730, ..., -0.6945, -12.1807, -11.0098]],
device='cuda:0')
rocpd_op: 0
rocpd_api_ops: 0
rocpd_kernelapi: 0
rocpd_copyapi: 0
rocpd_api: 0
rocpd_string: 0
rpd_tracer: finalized in 6.323764 ms
double free or corruption (!prev)
/usr/local/bin/runTracer.sh: line 42: 20 Aborted LD_PRELOAD=librpd_tracer.so "$@"
root@tw024:/ws/Try_rPD# cat matmult_gpu.py
import argparse
import torch

def matmult_gpu(input_data, weights):
    """
    Perform matrix multiplication of two tensors on GPU.

    Args:
        input_data (torch.Tensor): Input tensor.
        weights (torch.Tensor): Weight tensor.

    Returns:
        torch.Tensor: Result of matrix multiplication.
    """
    # Move both tensors to the GPU
    input_data = input_data.to('cuda')
    weights = weights.to('cuda')

    # Matrix multiplication using torch.matmul
    output = torch.matmul(input_data, weights)

    return output

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Perform matrix multiplication of two tensors.')
    parser.add_argument('--x_shape', nargs=2, type=int, default=[1000, 500], metavar=('N', 'M'), help='Shape of input data matrix')
    parser.add_argument('--w_shape', nargs=2, type=int, default=[500, 500], metavar=('J', 'K'), help='Shape of weight matrix')
    args = parser.parse_args()

    input_data = torch.randn(*args.x_shape)
    weights = torch.randn(*args.w_shape)

    output = matmult_gpu(input_data, weights)
    print(f'Shape of input data matrix: {args.x_shape}, weight matrix: {args.w_shape}, result matrix:{output.shape}')
    print(output)
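
The trace log above reports 0 rows for every table before the abort. To double-check that against the file on disk, a minimal sketch like the following could be used. It assumes that trace.rpd is a SQLite database and that the table names match the ones printed in the log. The helper name `count_rows` is hypothetical and not part of rocmProfileData.

```python
import os
import sqlite3
from contextlib import closing

# Table names copied from the tracer's summary output in the log above.
TABLES = [
    "rocpd_op",
    "rocpd_api_ops",
    "rocpd_kernelapi",
    "rocpd_copyapi",
    "rocpd_api",
    "rocpd_string",
]


def count_rows(db_path, tables=TABLES):
    """Return {table: row_count}, or None for a table that does not exist.

    Assumption: db_path points at a SQLite file (as trace.rpd is assumed to be).
    """
    if not os.path.exists(db_path):
        # sqlite3.connect would silently create an empty file otherwise.
        raise FileNotFoundError(db_path)
    counts = {}
    with closing(sqlite3.connect(db_path)) as conn:
        existing = {
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type IN ('table', 'view')"
            )
        }
        for table in tables:
            if table in existing:
                # Names come from the fixed list above, so quoting them is safe.
                query = f'SELECT COUNT(*) FROM "{table}"'
                counts[table] = conn.execute(query).fetchone()[0]
            else:
                counts[table] = None
    return counts
```

Calling `count_rows("trace.rpd")` in the working directory would show whether any rows were written before the process aborted.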

Operating System

Ubuntu 22.04 inside the Docker image rocm/vllm-dev:20241025-tuned

CPU

AMD EPYC 9654 96-Core Processor

GPU

AMD MI300X

ROCm Version

ROCm 6.2.0

ROCm Component

No response

Steps to Reproduce

  1. Start a container from the image rocm/vllm-dev:20241025-tuned.
  2. Log in to the container.
  3. Install rocmProfileData in the container.
  4. Run the trace example "Profiling a PyTorch multiplication function" from https://github.com/ROCm/rocmProfileData/blob/master/examples/rocm-profile-data/README.md
  5. The run aborts with the log "/usr/local/bin/runTracer.sh: line 42: 20 Aborted LD_PRELOAD=librpd_tracer.so "$@""
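
For anyone scripting the reproduction, step 5 can be detected from the exit status instead of the log text. The sketch below is only an illustration under assumptions: it assumes `runTracer.sh` is on the PATH and that an abort surfaces either as a negative return code (a child killed directly by a signal) or as the shell convention of 128 plus the signal number. The helper names `describe_exit` and `run_traced` are hypothetical.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Turn a process return code into a short human-readable description."""
    if returncode == 0:
        return "exited cleanly"
    if returncode < 0:
        # subprocess reports "killed by signal N" as -N.
        signum = -returncode
    elif returncode > 128:
        # Shells conventionally report "killed by signal N" as 128 + N.
        signum = returncode - 128
    else:
        return f"exited with status {returncode}"
    try:
        return f"terminated by {signal.Signals(signum).name}"
    except ValueError:
        # Not a known signal number, so treat it as a plain exit status.
        return f"exited with status {returncode}"


def run_traced(command):
    """Run a command under runTracer.sh (assumed to be on PATH) and describe how it ended."""
    result = subprocess.run(["runTracer.sh", *command])
    return describe_exit(result.returncode)
```

With the setup from the steps above, `run_traced(["python", "matmult_gpu.py"])` would be expected to report a SIGABRT termination if the abort reproduces. That expectation assumes the script propagates the child's status.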

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

No response
