feat(kernels): add TVM FFI Triton CUBIN wrapper#315
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces tvm-ffi as a required dependency for pegainfer-kernels and implements a TVM FFI wrapper layer for Triton AOT CUBIN launchers, specifically exposing the Qwen3.5 gated-delta-rule triangular solve launcher. It also adds corresponding documentation, tests, and an example. Feedback on the implementation highlights a critical runtime issue in triton_cubin.rs where integer arguments are extracted directly as u64 or i32 using AnyView::try_as. Because TVM FFI packed integer arguments are always passed as i64, these strict type-matching checks will fail at runtime. The reviewer recommends extracting them as i64 first before casting.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| fn arg_handle(args: &[AnyView<'_>], spec: TritonCubinFunctionSpec, idx: usize) -> TvmResult<usize> { | ||
| let value = args | ||
| .get(idx) | ||
| .ok_or_else(|| type_error(spec, idx, "an integer or opaque pointer"))?; | ||
| if let Some(raw) = value.try_as::<u64>() { | ||
| return Ok(raw as usize); | ||
| } | ||
| if let Some(raw) = value.try_as::<*mut c_void>() { | ||
| return Ok(raw as usize); | ||
| } | ||
| Err(type_error(spec, idx, "an integer or opaque pointer")) | ||
| } |
There was a problem hiding this comment.
In TVM FFI, packed function arguments of integer type are always passed as signed 64-bit integers (i64). Since AnyView::try_as performs strict type-code matching, attempting to extract them using try_as::<u64>() will fail at runtime with a type error. To ensure robustness, we should attempt to extract the value as i64 first before trying u64 or *mut c_void.
fn arg_handle(args: &[AnyView<'_>], spec: TritonCubinFunctionSpec, idx: usize) -> TvmResult<usize> {
let value = args
.get(idx)
.ok_or_else(|| type_error(spec, idx, "an integer or opaque pointer"))?;
if let Some(raw) = value.try_as::<i64>() {
return Ok(raw as usize);
}
if let Some(raw) = value.try_as::<u64>() {
return Ok(raw as usize);
}
if let Some(raw) = value.try_as::<*mut c_void>() {
return Ok(raw as usize);
}
Err(type_error(spec, idx, "an integer or opaque pointer"))
}| fn arg_i32(args: &[AnyView<'_>], spec: TritonCubinFunctionSpec, idx: usize) -> TvmResult<i32> { | ||
| args.get(idx) | ||
| .and_then(AnyView::try_as::<i32>) | ||
| .ok_or_else(|| type_error(spec, idx, "an integer")) | ||
| } |
There was a problem hiding this comment.
Similar to arg_handle, scalar integer arguments passed from Python/TVM are represented as i64 in the TVM FFI packed call. Calling try_as::<i32>() directly will fail because of strict type-code matching. We should extract the value as i64 first and then cast it to i32.
fn arg_i32(args: &[AnyView<'_>], spec: TritonCubinFunctionSpec, idx: usize) -> TvmResult<i32> {
let value = args
.get(idx)
.ok_or_else(|| type_error(spec, idx, "an integer"))?;
if let Some(raw) = value.try_as::<i64>() {
return Ok(raw as i32);
}
if let Some(raw) = value.try_as::<i32>() {
return Ok(raw);
}
Err(type_error(spec, idx, "an integer"))
}There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: eed6eb7505
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| fn arg_i32(args: &[AnyView<'_>], spec: TritonCubinFunctionSpec, idx: usize) -> TvmResult<i32> { | ||
| args.get(idx) | ||
| .and_then(AnyView::try_as::<i32>) | ||
| .ok_or_else(|| type_error(spec, idx, "an integer")) |
There was a problem hiding this comment.
Accept TVM i64 integer arguments
This path rejects the normal TVM integer representation for seq_len and num_value_heads: AnyView::try_as::<i32>() is an exact type check rather than a narrowing conversion, while TVM FFI integer values commonly arrive as i64 (including from Python/TVM typed calls). In that environment the new global function returns a type error before launching the CUBIN even for valid dimensions; parse i64/u64 and range-check before casting to i32.
Useful? React with 👍 / 👎.
Summary
Adds the MVP production-facing TVM FFI path for Triton AOT CUBIN launchers in pegainfer-kernels.
This makes tvm-ffi a required pegainfer-kernels dependency, exposes a packed TVM FFI global wrapper for the generated Qwen3.5 gated-delta-rule Triton launcher, and adds a small example showing registration/discovery of the wrapper ABI.
Changes
Validation
Related
Related to #191.
This is intentionally narrower than #202: #202 validates optional bidirectional interop, while this PR makes tvm-ffi required for pegainfer-kernels and wires the first Triton CUBIN wrapper surface.