lttng: Add req_type to ProcessCompletions trace #877

Open
wants to merge 1 commit into
base: master

AvivBenchorin
Contributor

To make it easier to differentiate between RDMA transport ProcessCompletions traces associated with send requests and those associated with recv requests, this adds the request type (enum nccl_net_ofi_rdma_req_type_t) as a field to the LTTng tracepoint. Now, when processing RDMA transport ProcessCompletions traces, you can identify send requests, which have req_type NCCL_OFI_RDMA_SEND (enum value of 2), and recv requests, which have req_type NCCL_OFI_RDMA_RECV (enum value of 3).

For SENDRECV transport ProcessCompletions traces, req_type is set to -1 since the SENDRECV transport does not have a request type enum.
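As a rough illustration of how a trace consumer might use the new field, the sketch below tallies req_type values using only the numbers quoted above (2 for send, 3 for recv, -1 for SENDRECV). The constant names, the `req_type_tally` struct, and both helper functions are hypothetical and exist only in this sketch; they are not part of the plugin or of LTTng.

```c
#include <assert.h>
#include <stddef.h>

/* Values quoted in the PR description. The names are local to this sketch. */
enum {
    SKETCH_REQ_TYPE_SENDRECV = -1, /* placeholder emitted by the SENDRECV transport */
    SKETCH_REQ_TYPE_RDMA_SEND = 2, /* NCCL_OFI_RDMA_SEND */
    SKETCH_REQ_TYPE_RDMA_RECV = 3  /* NCCL_OFI_RDMA_RECV */
};

/* Hypothetical helper: human-readable label for a traced req_type value. */
static const char *req_type_label(int req_type)
{
    switch (req_type) {
    case SKETCH_REQ_TYPE_RDMA_SEND: return "rdma-send";
    case SKETCH_REQ_TYPE_RDMA_RECV: return "rdma-recv";
    case SKETCH_REQ_TYPE_SENDRECV:  return "sendrecv (no request type)";
    default:                        return "other";
    }
}

/* Hypothetical counters for a batch of ProcessCompletions events. */
struct req_type_tally {
    size_t send;
    size_t recv;
    size_t sendrecv;
    size_t other;
};

/* Count how many events in types[0..n) fall into each category. */
static struct req_type_tally tally_req_types(const int *types, size_t n)
{
    struct req_type_tally t = {0, 0, 0, 0};

    assert(types != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        switch (types[i]) {
        case SKETCH_REQ_TYPE_RDMA_SEND: t.send++;     break;
        case SKETCH_REQ_TYPE_RDMA_RECV: t.recv++;     break;
        case SKETCH_REQ_TYPE_SENDRECV:  t.sendrecv++; break;
        default:                        t.other++;    break;
        }
    }
    return t;
}
```

Any other value of the RDMA request type enum lands in the "other" bucket here, since only the send and recv values are stated in this description.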

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

In order to more easily differentiate between RDMA transport
ProcessCompletions traces associated with send requests or with recv
requests, this adds the request type (enum nccl_net_ofi_rdma_req_type_t)
as a field to the LTTng tracepoint. Now, when processing
RDMA transport ProcessCompletions traces, you can identify send
requests, which have req_type NCCL_OFI_RDMA_SEND (enum value of 2),
and recv requests, which have req_type NCCL_OFI_RDMA_RECV (enum value
of 3).

For SENDRECV transport ProcessCompletions traces, req_type is set to
-1 since the SENDRECV transport does not have a request type enum.

Signed-off-by: Aviv Benchorin <[email protected]>
@AvivBenchorin AvivBenchorin requested a review from rajachan May 9, 2025 17:38
@AvivBenchorin AvivBenchorin requested a review from a team as a code owner May 9, 2025 17:38
@@ -188,7 +188,7 @@ static int sendrecv_req_handle_cq_entry(nccl_net_ofi_context_t *ctx,

nccl_net_ofi_sendrecv_req_t *req = container_of(ctx, nccl_net_ofi_sendrecv_req_t, ctx);

- NCCL_OFI_TRACE_COMPLETIONS_SENDRECV(req->dev_id, req, &ctx->ofi_ctx);
+ NCCL_OFI_TRACE_COMPLETIONS_SENDRECV(req->dev_id, req, -1, &ctx->ofi_ctx);
Contributor

Wondering if we can use cq_entry->flags to differentiate between request types for SENDRECV.

Contributor Author

That's a possibility. I tried replacing the -1 with cq_entry->flags for SENDRECV, and the cq_entry->flags value here is 2058 for send requests (FI_SEND, FI_TAGGED, and FI_MSG) and 1034 for recv requests (FI_RECV, FI_TAGGED, and FI_MSG). Libfabric flags defined here.

ProcessCompletions: { cpu_id = 22 }, { dev = 0, request = 0x79024BEC3F80, req_type = 2058, ctx = 133050770669456 }
...
ProcessCompletions: { cpu_id = 22 }, { dev = 0, request = 0x7901B0154EF0, req_type = 1034, ctx = 133048156114688 }
...
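To make the two values in this output easier to read, here is a minimal sketch that pulls req_type out of a trace line and decodes it as a libfabric completion flag mask. The four bit positions mirror the FI_MSG, FI_TAGGED, FI_RECV, and FI_SEND definitions in libfabric's rdma/fabric.h, redefined under sketch-only names so the block compiles without libfabric. `parse_req_type` and `format_cq_flags` are hypothetical helpers, and the line format is assumed to match the text output shown above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Bit positions as in libfabric's rdma/fabric.h; names are local to this sketch. */
#define SKETCH_FI_MSG    (1ULL << 1)
#define SKETCH_FI_TAGGED (1ULL << 3)
#define SKETCH_FI_RECV   (1ULL << 10)
#define SKETCH_FI_SEND   (1ULL << 11)

static const struct {
    uint64_t bit;
    const char *name;
} known_flags[] = {
    { SKETCH_FI_SEND,   "FI_SEND"   },
    { SKETCH_FI_RECV,   "FI_RECV"   },
    { SKETCH_FI_TAGGED, "FI_TAGGED" },
    { SKETCH_FI_MSG,    "FI_MSG"    },
};

/* Extract the integer after "req_type = " from one trace line.
 * Returns 0 on success, -1 if the field is missing or not a number. */
static int parse_req_type(const char *line, long long *out)
{
    static const char key[] = "req_type = ";
    const char *p;
    char *end;

    assert(line != NULL && out != NULL);
    p = strstr(line, key);
    if (p == NULL)
        return -1;
    p += sizeof(key) - 1;
    *out = strtoll(p, &end, 10);
    return end == p ? -1 : 0;
}

/* Write the known flag names in `flags` to buf, separated by '|'.
 * Returns the bits that were not rendered (unknown bits, or names that
 * did not fit in buf). */
static uint64_t format_cq_flags(uint64_t flags, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof(known_flags) / sizeof(known_flags[0]); i++) {
        size_t used = strlen(buf);
        int n;

        if ((flags & known_flags[i].bit) == 0)
            continue;
        n = snprintf(buf + used, len - used, "%s%s",
                     used ? "|" : "", known_flags[i].name);
        if (n < 0 || (size_t)n >= len - used) {
            buf[used] = '\0'; /* drop the partially written name */
            break;
        }
        flags &= ~known_flags[i].bit;
    }
    return flags;
}
```

Applied to the two lines above, 2058 decodes to FI_SEND|FI_TAGGED|FI_MSG and 1034 to FI_RECV|FI_TAGGED|FI_MSG, matching the breakdown given in the comment.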

If representing the cq_entry->flags value as the req_type field would be appropriate, I could reuse the same ProcessCompletions tracepoint event for both the RDMA and SENDRECV transports. Otherwise, I might need to create a separate ProcessCompletionsSendRecv tracepoint event that has a cq_entry_flags field instead.

Contributor

The analog of RDMA's req->type is SENDRECV's req->direction (link); I would use that here.
