[PAL] Support Device API Transport #445
Open
MC952-arch wants to merge 2 commits into flagos-ai:main
Pull request overview
This PR updates the device-side abstraction layer to use a unified Device Transport interface (replacing flagcxDevNet) and aligns barrier/session tagging accordingly, enabling transport-backed load/store and one-sided operations across vendor and fallback backends.
Changes:
- Replaced device-side `flagcxDevNet` usage with `flagcxDevTransport` across device API kernels and wrappers.
- Unified barrier tagging by switching from the old barrier-tag types to `flagcxTeamTag{Intra,Inter,World}`.
- Added/propagated `Transport` support in comm traits (vendor + fallback) and ensured `flagcxDevMem` preserves an explicit raw pointer.
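For readers unfamiliar with the comm-traits pattern these changes rely on, the following is a minimal host-side sketch, not the FlagCX implementation. Every name in it (`TeamTagIntra`, `FallbackTraits`, `DevTransport`, `teamSize`, `demoPutSignal`) is hypothetical and chosen only to mirror the shape described above: a backend-specific `Transport` type exposed through a traits struct, a generic wrapper that kernels would code against, and empty tag types that select the barrier team at compile time. The rule that the world team is the product of ranks per node and node count is an assumption for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical team tags, mirroring the Intra/Inter/World split named in the PR.
struct TeamTagIntra {};
struct TeamTagInter {};
struct TeamTagWorld {};

// Tag dispatch: the tag type picks the overload at compile time.
// Assumption: intra = ranks on one node, inter = one rank per node, world = all ranks.
inline std::size_t teamSize(TeamTagIntra, std::size_t ranksPerNode, std::size_t) {
  return ranksPerNode;
}
inline std::size_t teamSize(TeamTagInter, std::size_t, std::size_t nNodes) {
  return nNodes;
}
inline std::size_t teamSize(TeamTagWorld, std::size_t ranksPerNode, std::size_t nNodes) {
  return ranksPerNode * nNodes;
}

// Hypothetical fallback backend: each peer's "remote" window is a host buffer.
struct FallbackTraits {
  struct Transport {
    std::vector<std::vector<std::uint8_t>> peerMem;  // one window per peer
    std::vector<std::uint64_t> signals;              // one signal counter per peer

    Transport(std::size_t nPeers, std::size_t bytes)
        : peerMem(nPeers, std::vector<std::uint8_t>(bytes, 0)), signals(nPeers, 0) {}

    // One-sided write into a peer's window.
    void put(std::size_t peer, std::size_t offset, const void* src, std::size_t n) {
      assert(peer < peerMem.size() && offset + n <= peerMem[peer].size());
      std::memcpy(peerMem[peer].data() + offset, src, n);
    }
    void signal(std::size_t peer) { ++signals[peer]; }
    bool reached(std::size_t peer, std::uint64_t expected) const {
      return signals[peer] >= expected;
    }
  };
};

// Generic wrapper: callers depend on DevTransport<Traits>, never on a backend type.
template <typename Traits>
class DevTransport {
 public:
  explicit DevTransport(typename Traits::Transport& impl) : impl_(impl) {}

  // Write the payload, then bump the peer's signal so a waiter can observe it.
  void putSignal(std::size_t peer, std::size_t offset, const void* src, std::size_t n) {
    impl_.put(peer, offset, src, n);
    impl_.signal(peer);
  }
  bool signalReached(std::size_t peer, std::uint64_t expected) const {
    return impl_.reached(peer, expected);
  }

 private:
  typename Traits::Transport& impl_;
};

// Usage: put a 4-byte payload into peer 1's window and read it back.
inline std::uint32_t demoPutSignal() {
  FallbackTraits::Transport backend(2, 64);
  DevTransport<FallbackTraits> transport(backend);
  const std::uint32_t payload = 0xABCD1234u;
  transport.putSignal(1, 8, &payload, sizeof(payload));
  assert(transport.signalReached(1, 1));
  std::uint32_t received = 0;
  std::memcpy(&received, backend.peerMem[1].data() + 8, sizeof(received));
  return received;
}
```

In a design like this, a vendor backend would supply its own traits struct with a `Transport` of the same shape, so code written against `DevTransport<Traits>` would not change when the backend does.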
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `flagcx/kernels/device_api.cu` | Migrates kernels from DevNet to DevTransport (put/signal/wait/flush + intra load/store). |
| `flagcx/kernels/custom_allreduce.cu` | Adds `device_utils.h` include to ensure device macro/type availability. |
| `flagcx/adaptor/include/device_api/nvidia_comm_traits.h` | Renames/introduces vendor `Transport`, updates barrier specializations, and adjusts window/coop definitions. |
| `flagcx/adaptor/include/device_api/flagcx_device.h` | Replaces `flagcxDevNet` wrapper with `flagcxDevTransport`; updates barrier wrappers and tag usage; stores `flagcxDevMem` raw ptr explicitly. |
| `flagcx/adaptor/include/device_api/fallback_comm_traits.h` | Renames fallback `Net` to `Transport` and updates action/signal/counter types accordingly. |
| `flagcx/adaptor/include/device_api/comm_traits.h` | Renames one-sided action types to `flagcxDevTransport_*` and replaces barrier tags with `flagcxTeamTag*`. |
| `flagcx/adaptor/flagcx_device.cc` | Initializes `nInterPeers` for vendor devComm path to enable multi-node transport/barrier behavior in kernels. |
PR Category
PAL
PR Types
New Features
PR Description
This PR updates the device-side abstraction layer to use a unified Device Transport interface (replacing flagcxDevNet) and aligns barrier/session tagging accordingly, enabling transport-backed load/store and one-sided operations across vendor and fallback backends.