
[Fix] Allow autotuning kernels with scalar value parameters #2136

Open

yurekami wants to merge 1 commit into tile-ai:main from yurekami:fix-autotune-scalar-value-params

Conversation

@yurekami (Contributor) commented May 3, 2026

Summary

Fixes #2081. Any kernel signature that includes a scalar value parameter, e.g.

import tilelang
import tilelang.language as T

@tilelang.autotune(configs=[{"threads": 128}, {"threads": 256}], ...)
@tilelang.jit
def test_fun(N=4096, BLOCK_N=512, threads=128):
    @T.prim_func
    def kernel(A: T.Tensor((N,), T.float32), s: T.float32):  # <- scalar value param
        ...
    return kernel

test_fun()(A, 0.1)  # crashes during autotune

cannot currently be autotuned.

Root cause

The autotuner asks the profiler to generate inputs for every parameter via Profiler._get_inputs, which calls get_tensor_supply(...)(param) for each one. For the scalar s, that path lands in tilelang/utils/tensor.py:get_tensor, which unconditionally raises on empty-shape KernelParam objects:

if hasattr(param, "shape") and not param.shape:
    raise ValueError(
        f"TensorType must have a shape, but got {type(param)}, "
        "likely you are trying to generate a random tensor with a dynamic symbolic shape."
    )

The error message was also misleading — the actual cause is a scalar value parameter, not a dynamic shape (which is detected separately by the tir.Var check below).
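
For context, the dynamic-shape path that message alludes to is a separate check further down in get_tensor. A rough paraphrase of it (an assumption about its exact form; this PR only states that a tir.Var check handles dynamic shapes):

from tvm import tir

# Paraphrase, not the exact source: dynamic symbolic dims are tir.Var nodes,
# and a random tensor cannot be generated for an unresolved symbolic shape.
if any(isinstance(dim, tir.Var) for dim in param.shape):
    raise ValueError("cannot generate a random tensor for a dynamic symbolic shape")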

Fix

Teach get_tensor to recognize scalar params and return a Python scalar of the matching dtype family (False for bool, 0.0 for floats, 0 for ints) so the autotuner can invoke the kernel during benchmarking. Users who need a specific scalar value can still override per-kernel via supply_prog.

The scalar fast path runs before get_current_device(), so this also makes input generation work on CPU-only hosts when the kernel only takes scalar params.
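
For reference, the scalar fast path added in this PR looks like the following (reconstructed from the reviewer's diff later in this thread; dtype here is the torch dtype derived from param earlier in get_tensor):

# Early exit inside get_tensor(param: KernelParam), before any device lookup:
if hasattr(param, "shape") and not param.shape:        # empty shape => scalar value param
    if hasattr(param, "is_boolean") and param.is_boolean():
        return False                                   # deterministic bool default
    return 0.0 if dtype.is_floating_point else 0       # deterministic float / int default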

Test plan

  • Added testing/python/utils/test_tensor_supply_scalar.py, a CUDA-free unit test parameterised over every TensorSupplyType and the common dtype families (float32/16/64, bfloat16, int8/32/64, uint8, bool). Locks in that scalar params yield Python scalars of the right type for every supply variant (a sketch of its shape follows this list).
  • ruff check + ruff format --check clean on both files.
  • Maintainer to confirm autotune now succeeds on the repro from [BUG] Autotuning + kernel value parameters = failure #2081 (I do not have a CUDA box to run do_bench end-to-end).
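
A minimal sketch of the shape of that test (illustrative only: the import path, the enum iteration, and the make_scalar_param helper are assumptions, not the actual contents of test_tensor_supply_scalar.py):

import pytest
from tilelang.utils.tensor import TensorSupplyType, get_tensor_supply

@pytest.mark.parametrize("supply_type", list(TensorSupplyType))
@pytest.mark.parametrize("dtype", ["float32", "float16", "float64", "bfloat16", "int32", "bool"])
def test_scalar_param_yields_python_scalar(supply_type, dtype):
    # make_scalar_param is a hypothetical helper that builds a KernelParam
    # with an empty shape and the given dtype.
    param = make_scalar_param(dtype)
    value = get_tensor_supply(supply_type)(param)
    if dtype == "bool":
        assert value is False
    elif dtype in ("float32", "float16", "float64", "bfloat16"):
        assert value == 0.0
    else:
        assert value == 0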

Tradeoffs / notes

  • The default scalar value is 0 / 0.0 / False. This is a deliberate, deterministic choice: randomized scalars would surprise users whose kernels branch on the value. Anyone needing specific scalars should use supply_prog, which already takes precedence over _get_inputs (a hypothetical usage sketch follows these notes).
  • The previous error path is removed because it was unreachable for legitimate uses (scalar params now succeed) and its message was actively misleading. The dynamic-shape error path remains untouched.
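
A hypothetical usage sketch for that escape hatch (the exact supply_prog signature is not shown in this thread, so the callback shape below is an assumption):

import torch

def my_supply(params):
    # Return one concrete argument per kernel parameter, pinning the scalar
    # to a representative value instead of the default 0.0.
    A = torch.randn(4096, dtype=torch.float32, device="cuda")
    return [A, 0.5]

# Passed wherever the autotuner accepts a custom supply program; it then
# takes precedence over Profiler._get_inputs during benchmarking.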

Closes: #2081

Summary by CodeRabbit

  • Bug Fixes

    • Scalar (zero-dimensional) parameters are now properly handled, returning appropriate Python scalar values instead of errors. This enables the autotuner to work with scalar arguments.
  • Tests

    • Added regression tests validating scalar parameter support across multiple data types and tensor supply modes.

github-actions (Bot) commented May 3, 2026

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run pre-commit run --all-files in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀

coderabbitai (Bot) commented May 3, 2026

📝 Walkthrough

The pull request adds scalar parameter support to get_tensor_supply() by returning Python scalar values for empty-shaped KernelParam objects instead of raising an error, enabling autotuner invocation with scalar arguments. Comprehensive regression tests validate the change across all TensorSupplyType modes and dtype cases.

Changes

Scalar Parameter Support

  • Core Implementation (tilelang/utils/tensor.py): get_tensor() now detects empty-shaped parameters and returns dtype-appropriate Python scalars (False for boolean, 0.0 for float, 0 for others) instead of raising ValueError.
  • Regression Tests (testing/python/utils/test_tensor_supply_scalar.py): Parametrized test cases validate scalar return values across all TensorSupplyType modes and dtype strings; an additional CPU-only test verifies the scalar fast path does not trigger CUDA dependencies.

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A scalar hops into the tuner's embrace,
No longer lost in tensor space,
With Python values, swift and true,
Autotuning dreams come into view!
*—Cottontail, Chief of Code Review 🎯✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): Docstring coverage is 50.00%, which is insufficient; the required threshold is 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

  • Description Check (✅ Passed): Check skipped - CodeRabbit's high-level summary is enabled.
  • Title Check (✅ Passed): The title '[Fix] Allow autotuning kernels with scalar value parameters' clearly and accurately describes the primary change: enabling autotuning for kernels with scalar value parameters.
  • Linked Issues Check (✅ Passed): The pull request fully addresses issue #2081 by modifying get_tensor() to recognize and handle scalar parameters, returning appropriate Python scalars instead of raising an error.
  • Out of Scope Changes Check (✅ Passed): All changes are directly scoped to fixing scalar parameter handling in autotuning. The modifications to tensor.py and the new test module are focused on the stated objective.


coderabbitai (Bot) left a comment

🧹 Nitpick comments (2)
tilelang/utils/tensor.py (2)

46-46: ⚡ Quick win

Update the return type annotation to reflect the scalar path.

get_tensor now returns bool | float | int for scalar params, but the annotation still declares -> torch.Tensor. Static analysis tools and IDE type checkers will flag call sites that receive and use these values.

♻️ Suggested annotation fix
-    def get_tensor(param: KernelParam) -> torch.Tensor:
+    def get_tensor(param: KernelParam) -> "torch.Tensor | bool | float | int":

55-58: 💤 Low value

hasattr guards are unnecessary — simplify the scalar check.

KernelParam is a dataclass that always has a shape field and an is_boolean() method (used unconditionally by the tensor path at lines 75, 99). The double hasattr calls add noise without guarding against any real runtime scenario.

♻️ Suggested simplification
-        if hasattr(param, "shape") and not param.shape:
-            if hasattr(param, "is_boolean") and param.is_boolean():
+        if not param.shape:
+            if param.is_boolean():
                 return False
             return 0.0 if dtype.is_floating_point else 0


📥 Commits

Reviewing files that changed from the base of the PR and between 2eec5f0 and ad0eb18.

📒 Files selected for processing (2)
  • testing/python/utils/test_tensor_supply_scalar.py
  • tilelang/utils/tensor.py

@LeiWang1999 (Member) commented:

@regression-perf

LeiWang1999 requested a review from Rachmanino on May 6, 2026 03:39
@Rachmanino (Collaborator) commented:

Hi @yurekami, thanks for your contribution! However, I think we should have a discussion here to fully figure out the problem.

The key point is that, if a scalar parameter is given, the TileLang compiler has no knowledge of its usage in the kernel. What if it acts as something like an index or a threshold? Its specific value may affect the workload or even the legality of the program (e.g. the number of elements to be processed depends on the threshold, or a wrong index results in illegal memory access). Therefore, my idea is to explicitly reject generating default values for scalar params (see #2084). And if users want to tune a kernel with a scalar param, they are required to provide the value of the scalar param via a customized supply program.

Do you agree with this solution? Also welcome for discussion or better ideas!

@Triang-jyed-driung commented:

I think the user is responsible for providing default values for the scalar value parameters during autotuning. Even if it serves as an index or a threshold, the user should have the right to specify the most probable path for subsequent autotuning. For example, if an a: int32 is used in the denominator, the user could specify a as 1 during autotuning, leaving the a == 0 branch untouched, since 0 is rarely encountered.

While #2081 shows a solution, it is not elegant: parameters need to be specified both at @tilelang.autotune (next to the kernel definition) AND at set_autotune_inputs (next to the kernel usage). Often they are miles apart from each other, making the code harder to maintain.

Rachmanino (Collaborator) commented May 7, 2026 via email

