Add RDNA gfx1151 ROCm target support #2127
Status: Open. lhl wants to merge 13 commits into tile-ai:main (base: main) from lhl:rdna3-gfx1151.
Changes shown from 3 of 13 commits.
Commits (all by lhl):

- 1241265 Add RDNA gfx1151 ROCm target support
- 152f912 Fix optional annotation handling for gfx1100
- 51acb3c Address CodeRabbit ROCm cleanup feedback
- 462e4a0 Scope MFMA tests to CDNA targets
- 35d53bf Scope gfx950 copy async GEMM test
- 355ffe2 Reject non-divisible warp layouts
- d77e26d Skip CUDA/CDNA GEMM dtype tests on RDNA
- 09d25c9 Scope RDNA device model to gfx11
- 55fb3e3 Use RDNA arch helper in block scoring
- 2ff7662 Make RDNA tensor intrinsics generation-aware
- 6ed0e94 Exclude gfx1010 from SDOT4 guard
- 08a0490 Store RDNA tensor instructions as list
- ac560ce Reject unsupported RDNA carver targets
New test file (@@ -0,0 +1,74 @@):

```python
from tilelang import tvm as tvm
from tvm.target import Target

import tilelang.utils.target as target_utils
from tilelang.utils.target import (
    determine_target,
    normalize_rocm_arch,
    rocm_warp_size_for_arch,
    target_get_mcpu,
    target_get_rdna_generation,
    target_get_warp_size,
    target_is_cdna,
    target_is_rdna,
)


def test_normalize_rocm_arch_strips_feature_suffix():
    assert normalize_rocm_arch("gfx1151:sramecc+:xnack-") == "gfx1151"
    assert normalize_rocm_arch("gfx942") == "gfx942"
    assert normalize_rocm_arch("") is None
    assert normalize_rocm_arch("sm_90") is None
    assert rocm_warp_size_for_arch("gfx1151") == 32
    assert rocm_warp_size_for_arch("gfx1030") == 32
    assert rocm_warp_size_for_arch("gfx942") == 64
```
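The behavior the first test pins down can be sketched in a few lines. This is a hypothetical stand-in for the real helpers in `tilelang.utils.target`, inferred only from the assertions above: strip LLVM target-feature suffixes such as `:sramecc+:xnack-`, reject non-`gfx` names, and map RDNA (gfx10/gfx11/gfx12, wave32) versus CDNA/GCN (gfx9xx, wave64) warp sizes.

```python
def normalize_rocm_arch(arch):
    # Drop feature suffixes like ":sramecc+:xnack-"; only gfx* names
    # are valid ROCm archs (sketch, not the actual implementation).
    if not arch:
        return None
    base = arch.split(":", 1)[0]
    return base if base.startswith("gfx") else None


def rocm_warp_size_for_arch(arch):
    # RDNA parts (gfx10xx/gfx11xx/gfx12xx) run wave32; CDNA/GCN run wave64.
    return 32 if arch and arch.startswith(("gfx10", "gfx11", "gfx12")) else 64
```
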
```python
def test_target_mcpu_helpers():
    target = Target("hip -mcpu=gfx1151:sramecc+:xnack-")
    assert target_get_mcpu(target) == "gfx1151"


def test_determine_target_adds_rdna_thread_warp_size():
    target = determine_target("hip -mcpu=gfx1151", return_object=True)
    assert target_get_mcpu(target) == "gfx1151"
    assert int(target.attrs["thread_warp_size"]) == 32
```
```python
def test_auto_target_prefers_rocm_pytorch_over_cuda_toolkit(monkeypatch):
    monkeypatch.setattr(target_utils.torch.version, "hip", "test", raising=False)
    monkeypatch.setattr(target_utils, "check_hip_availability", lambda: True)
    monkeypatch.setattr(target_utils, "check_cuda_availability", lambda: True)
    monkeypatch.setattr(target_utils, "_detect_torch_rocm_arch", lambda: "gfx1151")

    target = determine_target("auto", return_object=True)
    assert target.kind.name == "hip"
    assert target_get_mcpu(target) == "gfx1151"
    assert int(target.attrs["thread_warp_size"]) == 32


def test_rdna_gfx1151_target_classification():
    target = Target("hip -mcpu=gfx1151")
    assert target_is_rdna(target)
    assert not target_is_cdna(target)
    assert target_get_rdna_generation(target) == 11
    assert target_get_warp_size(target) == 32
```
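The classification test expects `target_get_rdna_generation` to report 11 for gfx1151. A plausible way to derive that, assuming AMD's naming scheme where four-digit gfx numbers encode the RDNA generation in the leading two digits (gfx10xx is RDNA1/2, gfx11xx is RDNA3, gfx12xx is RDNA4), is a small regex helper. `rdna_generation_from_mcpu` is a hypothetical name, not the PR's actual implementation:

```python
import re


def rdna_generation_from_mcpu(mcpu):
    # Four-digit gfx names starting 10/11/12 are RDNA; the first two
    # digits are the generation. Everything else (e.g. CDNA gfx942)
    # yields None. Sketch only, inferred from the test expectations.
    m = re.match(r"gfx(1[0-2])\d{2}$", mcpu)
    return int(m.group(1)) if m else None
```
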
```python
def test_carver_routes_rdna_without_instantiating_device(monkeypatch):
    import torch

    monkeypatch.setattr(torch.version, "hip", None, raising=False)
    monkeypatch.setattr(torch.cuda, "is_available", lambda: False)
    if hasattr(torch, "mps"):
        monkeypatch.setattr(torch.mps, "is_available", lambda: False, raising=False)

    import tilelang.carver.arch as arch_mod

    def fake_rdna(target):
        return ("rdna", target)

    monkeypatch.setattr(arch_mod, "RDNA", fake_rdna)
    arch = arch_mod.get_arch(Target("hip -mcpu=gfx1151"))
    assert arch[0] == "rdna"
    assert target_get_mcpu(arch[1]) == "gfx1151"
```
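The carver test above patches `RDNA` so routing can be checked without a HIP device. The dispatch it exercises follows a common pattern: inspect the target kind and mcpu, then pick a device class. The sketch below uses plain dicts and stub classes instead of `tvm.target.Target` so it is self-contained; class names mirror the PR, but the predicate and `get_arch` body are simplified stand-ins:

```python
class CDNA:
    """Stub for the existing CDNA device class."""
    def __init__(self, target):
        self.target = target


class RDNA:
    """Stub for the RDNA device class this PR adds."""
    def __init__(self, target):
        self.target = target


def get_arch(target):
    # Route HIP targets by mcpu prefix: gfx10/11/12 are RDNA,
    # everything else falls back to CDNA (simplified sketch).
    if target.get("kind") != "hip":
        raise ValueError(f"unsupported target kind: {target.get('kind')}")
    mcpu = target.get("mcpu", "")
    if mcpu.startswith(("gfx10", "gfx11", "gfx12")):
        return RDNA(target)
    return CDNA(target)
```
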
New file (@@ -0,0 +1,52 @@):
```python
from __future__ import annotations

import tvm
from tvm.target import Target

from .arch_base import TileDevice
from .cuda import TensorInstruction

# Fallback when the runtime reports no shared-memory capacity:
# RDNA3 exposes 64 KiB of LDS per workgroup.
_RDNA_DEFAULT_LDS_SIZE = 64 * 1024


def is_rdna_arch(arch: TileDevice) -> bool:
    return isinstance(arch, RDNA)


class RDNA(TileDevice):

    def __init__(self, target: Target | str):
        if isinstance(target, str):
            target = tvm.target.Target(target)
        self.target = target
        device = tvm.runtime.rocm(0)
        if not device.exist:
            raise RuntimeError("Cannot find HIP device 0.")
        self.device: tvm.runtime.Device = device
        self.platform: str = "RDNA"

        reported_smem = device.max_shared_memory_per_block
        self.smem_cap = reported_smem if reported_smem > 0 else _RDNA_DEFAULT_LDS_SIZE
        self.compute_max_core = device.multi_processor_count
        self.warp_size = 32
        self.compute_capability = device.compute_version.replace(".", "")
        self.reg_cap: int = 32768
        self.max_smem_usage: int = 2 * self.smem_cap
        self.sm_partition: int = 4
        self.l2_cache_size_bytes: int = getattr(target, "l2_cache_size_bytes", 0)
        self.transaction_size: list[int] = [32, 128]

        # Keep the same units as the existing CUDA/CDNA heuristic. Strix Halo
        # is a UMA part, so use a conservative global-memory score seed.
        self.bandwidth: list[int] = [750, 12080]
        self.available_tensor_instructions: list[TensorInstruction] | None = None

    def get_avaliable_tensorintrin_shapes(self):
        # Stored as a list to match the annotation above (commit 08a0490).
        self.available_tensor_instructions = [TensorInstruction("wmma", [16, 16])]
        return [t.shape for t in self.available_tensor_instructions]

    def __repr__(self):
        return f"RDNA({self.target})"


__all__ = [
    "is_rdna_arch",
    "RDNA",
]
```