UCT/GGA: filter out unsupported resources from component list #10372
src/uct/ib/mlx5/gga/gga_mlx5.c
Outdated
}

for (num_resources = 0, i = 0; i < num_ib_resources; ++i) {
    status = uct_ib_mlx5_gga_md_open(component, resources[i].md_name,
Opening the md during query is too heavy an operation. Can we just do mlx5dv_open_device and call UCT_IB_MLX5_CMD_OP_QUERY_HCA_CAP? Perhaps extract some common code with uct_ib_mlx5_devx_md_open into a helper function.
AFAIU mlx5dv_open_device is the heaviest operation in md_open. Do you think such a refactoring makes sense just to avoid a couple of malloc/free calls?
src/uct/ib/mlx5/gga/gga_mlx5.c
Outdated
mlx5_md = ucs_derived_of(md, uct_ib_mlx5_md_t);
if (!ucs_test_all_flags(mlx5_md->flags, UCT_GGA_MLX5_MD_CAPS)) {
    ucs_debug("device %s does not match capabilities "
              "required for GGA(%"PRIx32"), md_flags=%"PRIx32,
              md_name, UCT_GGA_MLX5_MD_CAPS, mlx5_md->flags);
    status = UCS_ERR_NO_DEVICE;
    uct_ib_mlx5_devx_md_close(&md->super);
    goto out_free_dev_list;
}
why needed?
cc9a420
!ucs_test_all_flags(mlx5_md->flags, UCT_IB_MLX5_MD_FLAG_DEVX |
                    UCT_IB_MLX5_MD_FLAG_INDIRECT_XGVMI |
                    UCT_IB_MLX5_MD_FLAG_MMO_DMA)) {
convert to assertion?
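For illustration, the difference between the runtime check and the suggested assertion could look roughly like the sketch below. All names here (`CAP_*`, `REQUIRED_CAPS`, `open_checked`, `open_asserted`) are hypothetical stand-ins, not UCX identifiers, and the sketch assumes the component list has already been filtered so that an unsupported device reaching this point would be a programming error.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical capability bits standing in for the MD flags in the diff */
#define CAP_DEVX           (1u << 0)
#define CAP_INDIRECT_XGVMI (1u << 1)
#define CAP_MMO_DMA        (1u << 2)
#define REQUIRED_CAPS      (CAP_DEVX | CAP_INDIRECT_XGVMI | CAP_MMO_DMA)

/* True only if every bit of mask is set in value */
static int has_all_caps(uint32_t value, uint32_t mask)
{
    return (value & mask) == mask;
}

/* Runtime check: an unsupported device is an expected, recoverable case */
static int open_checked(uint32_t md_flags)
{
    if (!has_all_caps(md_flags, REQUIRED_CAPS)) {
        return -1; /* hypothetical error code */
    }
    return 0;
}

/* Assertion: unsupported devices were filtered out earlier, so hitting
 * this condition indicates a bug. Compiled out when NDEBUG is defined. */
static int open_asserted(uint32_t md_flags)
{
    assert(has_all_caps(md_flags, REQUIRED_CAPS));
    return 0;
}
```

The assertion variant only makes sense if the filtering step is guaranteed to run first; otherwise the runtime check is the safer choice.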
/azp run
Azure Pipelines successfully started running 4 pipeline(s).
/azp run perf
Azure Pipelines successfully started running 1 pipeline(s).
f7ddc81 to 777757e
What?
Filter out unsupported resources from GGA component
Why?
full log: https://dev.azure.com/ucfconsort/0b36e3f0-8ab9-4a48-b68b-4b2350e02c88/_apis/build/builds/92757/logs/1012
How?
check capabilities
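
A minimal sketch of the capability-based filtering idea, not the actual patch: the resource array is compacted in place so that only entries with all required flags remain. The `resource_t` type, the `CAP_*` bits, and `filter_resources` are hypothetical names invented for this example; the real code works on the UCT component resource list and the `UCT_IB_MLX5_MD_FLAG_*` flags, and obtains the flags from the device rather than from a prefilled field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits standing in for UCT_IB_MLX5_MD_FLAG_* */
#define CAP_DEVX           (1u << 0)
#define CAP_INDIRECT_XGVMI (1u << 1)
#define CAP_MMO_DMA        (1u << 2)
#define REQUIRED_CAPS      (CAP_DEVX | CAP_INDIRECT_XGVMI | CAP_MMO_DMA)

/* Hypothetical stand-in for a component resource descriptor */
typedef struct {
    const char *md_name;
    uint32_t    flags;   /* assumed to be already queried from the device */
} resource_t;

/* True only if every bit of mask is set in value */
static int test_all_flags(uint32_t value, uint32_t mask)
{
    return (value & mask) == mask;
}

/* Keep only supported entries, preserving order; returns the new count */
static size_t filter_resources(resource_t *resources, size_t num_resources)
{
    size_t num_kept = 0;
    size_t i;

    assert((resources != NULL) || (num_resources == 0));

    for (i = 0; i < num_resources; ++i) {
        if (test_all_flags(resources[i].flags, REQUIRED_CAPS)) {
            resources[num_kept++] = resources[i];
        }
    }

    return num_kept;
}
```

Compacting in place mirrors the `for (num_resources = 0, i = 0; ...)` loop shape in the diff and avoids allocating a second array.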