[PyT] Plumbing correct bias dims from TE to cudnn, while adding support for additional bias shapes by KshitijLakhani · Pull Request #2537 · NVIDIA/TransformerEngine

KshitijLakhani · 2025-12-20T00:55:39Z

Description

TE common was not plumbing attention vector bias dimensions correctly to cuDNN.
Instead of using shape from Bias, i.e. [bias_sq, bias_skv] it was using [sq, skv] thereby passing larger than required dims. This PR correctly plumbs the bias shape from TE PyT to cuDNN via TE common.

Additionally, this PR adds support for dbias, i.e., bias grad (fwd+bwd) calculation, for b1ss, bhss, and 11ss (initially only 1hss was supported) in both CP and non-CP cases.
Support for forward-only bias calculation, i.e., no bias grad, is also added for 111s in CP and non-CP cases
(bwd support will be added once cuDNN starts supporting it; TODOs are left in the code for this).

Lastly, tests are added to support all newly added functionality for both CP and non-CP cases
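
To make the dimension mismatch from the first paragraph concrete, here is a small sketch of how contiguous (row-major) strides follow from the dims that are passed. It assumes a contiguous bias layout, and `contiguous_strides` is a hypothetical helper, not a TE or cuDNN function. With s = 128, the two calls reproduce the dim/stride pairs that appear in the cuDNN FE logs quoted under "Supplementary testing".

```python
def contiguous_strides(dims):
    """Return row-major strides (in elements) for a tensor with the given dims."""
    strides = []
    running = 1
    # Walk from the innermost dimension outwards, accumulating the element count.
    for d in reversed(dims):
        strides.append(running)
        running *= d
    return list(reversed(strides))


# Dims built from [s_q, s_kv] (before the fix) for a [1, 1, 1, s] bias:
print(contiguous_strides([1, 1, 128, 128]))  # [16384, 16384, 128, 1]
# Dims built from [bias_sq, bias_skv] (after the fix):
print(contiguous_strides([1, 1, 1, 128]))    # [128, 128, 128, 1]
```

The first result describes 128 x 128 elements even though the actual bias buffer only holds 128, which is the "larger than required dims" problem described above.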

Type of change

Documentation change (change only to the documentation, either a fix or a new content)
Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Infra/Build change
Code refactoring

Changes

Pass bias_sq and bias_skv to fused_attn_arbitrary_seqlen_fwd_impl() and fused_attn_arbitrary_seqlen_bwd_impl()
Add new entries for bias_sq and bias_skv in FADescriptor_v1
Correct the bias passed to the MHA cuDNN graph to use bias_sq and bias_skv instead of s_q and s_kv
Enable dbias calculation for all cuDNN-supported shapes: 1hss, 11ss, b1ss, bhss
Add TODOs for when cuDNN starts supporting dbias calculation for bias shape 111s
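
As a rough illustration of the shape shorthand used in this list, the sketch below maps a 4-D bias shape to one of 1hss, 11ss, b1ss, bhss, or 111s and reports whether dbias is expected to be available. Both helper names are hypothetical and not part of TE; the support rule simply restates the list above.

```python
def classify_bias_shape(shape, b, h):
    """Map [bias_b, bias_h, bias_sq, bias_skv] to the shorthand used in this PR.

    Hypothetical helper for illustration; `b` and `h` are the batch size and
    number of heads of the attention call.
    """
    if len(shape) != 4:
        raise ValueError("bias must be 4-D: [bias_b, bias_h, bias_sq, bias_skv]")
    bias_b, bias_h, bias_sq, _ = shape
    if bias_b not in (1, b) or bias_h not in (1, h):
        raise ValueError(f"bias shape {shape} does not broadcast to b={b}, h={h}")
    # Only the fully broadcast case with bias_sq == 1 is treated as 111s here.
    if bias_b == 1 and bias_h == 1 and bias_sq == 1:
        return "111s"
    batch = "1" if bias_b == 1 else "b"
    heads = "1" if bias_h == 1 else "h"
    return f"{batch}{heads}ss"


def dbias_supported(shape_name):
    """dbias is listed as supported for every shape except 111s."""
    return shape_name in {"1hss", "11ss", "b1ss", "bhss"}


for shape in [(1, 12, 128, 128), (1, 1, 128, 128), (2, 1, 128, 128),
              (2, 12, 128, 128), (1, 1, 1, 128)]:
    name = classify_bias_shape(shape, b=2, h=12)
    print(shape, name, dbias_supported(name))
```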

Testing:

Added tests (fwd only, no bias grad) for the 111s bias shape in both non-CP and CP fused attn tests
Added tests for 1hss, b1ss, bhss, 111s bias shapes in CP fused attn tests (non-CP already has tests for all other supported shapes)
Confirmed by using NVTE_DEBUG and additional test logging that the same test bias shape passes from PyT layer to cuDNN (this was necessary as there were hard coded shapes that would show a false positive thereby masking actual behavior)

Supplementary testing:

Using the reproducer : https://github.com/cyanguwa/TransformerEngine/tree/test_111s for bias [1,1,1,s] it can be seen in the cuDNN FE logs that prior to this PR the bias dims passed onto cuDNN from TE were
{"data_type":null,"dim":[1,1,128,128],"is_pass_by_value":false,"is_virtual":false,"name":"bias","pass_by_value":null,"reordering_type":"NONE","stride":[16384,16384,128,1],"uid":0,"uid_assigned":false},
and after this PR they are:
"bias":{"data_type":null,"dim":[1,1,1,128],"is_pass_by_value":false,"is_virtual":false,"name":"bias","pass_by_value":null,"reordering_type":"NONE","stride":[128,128,128,1],"uid":0,"uid_assigned":false},

Checklist:

I have read and followed the contributing guidelines
The functionality is complete
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

KshitijLakhani · 2025-12-21T01:17:26Z

/te-ci pytorch L0 L1

greptile-apps · 2025-12-22T18:27:14Z

Greptile Summary

This PR fixes a bug where TE was passing incorrect bias dimensions (s_q, s_kv) to cuDNN instead of the actual bias tensor dimensions (bias_sq, bias_skv). The fix correctly extracts and plumbs these dimensions through the entire stack from PyTorch to cuDNN.

Key changes:

Adds bias_sq and bias_skv fields to FADescriptor_v1 and threads them through forward/backward functions
Updates cuDNN graph tensor creation to use actual bias dimensions instead of query/key-value sequence lengths
Enables dbias (bias gradient) support for additional shapes: b1ss, bhss, 11ss (previously only 1hss was supported)
Adds forward-only support for 111s bias shape (dbias not supported by cuDNN 9.18)
Special handling in Context Parallel for 111s shape: only splits s_kv dimension since s_q=1
Comprehensive test coverage for all supported bias shapes in both CP and non-CP modes

Testing:

All previously reported issues from review threads have been addressed
Tests verify correct dimension plumbing using debug logging
Forward-only tests for 111s with TODOs for future cuDNN dbias support

Confidence Score: 5/5

This PR is safe to merge with no blocking issues
The changes correctly fix a dimension-passing bug, add proper support for additional bias shapes, include comprehensive test coverage, and all previously identified issues from review threads have been addressed. The code properly handles the cuDNN limitation for 111s dbias with clear TODOs for future support.
No files require special attention

Important Files Changed

Filename	Overview
transformer_engine/common/fused_attn/fused_attn_f16_arbitrary_seqlen.cu	Correctly plumbs `bias_sq` and `bias_skv` dimensions to cuDNN instead of using `s_q` and `s_kv`. Adds conditional `dBias` tensor creation for supported shapes (excludes `111s`).
transformer_engine/common/fused_attn/utils.h	Adds `bias_sq` and `bias_skv` fields to `FADescriptor_v1` struct and updates comparison operator for cache key correctness.
transformer_engine/pytorch/attention/dot_product_attention/utils.py	Changes condition from generic non-`1hss` check to specific `111s+requires_grad` check. Uses `elif` for mutually exclusive conditions.
transformer_engine/pytorch/attention/dot_product_attention/context_parallel.py	Adds special handling for `111s` bias shape in CP: only splits `s_kv` dimension (not `s_q`), and conditionally creates `attn_dbias_` only for non-`111s` shapes.
tests/pytorch/attention/run_attention_with_cp.py	Adds `is_training` parameter, bias shape map, conditional grad computation. Fixes dbias comparison to use correct split dimension (2 instead of 3) after reshape.
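
The cache-key remark for `utils.h` in the table above can be pictured with a toy graph cache. This is a sketch in Python rather than the TE C++ code: `ToyDescriptor` and `get_or_build_graph` are made-up stand-ins for `FADescriptor_v1` and the graph cache, and only a few fields are modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyDescriptor:
    """Hypothetical, heavily reduced stand-in for FADescriptor_v1."""
    b: int
    h: int
    s_q: int
    s_kv: int
    bias_b: int
    bias_h: int
    bias_sq: int
    bias_skv: int


_graph_cache = {}


def get_or_build_graph(desc):
    """Return a cached 'graph' for this descriptor, building it on a miss."""
    if desc not in _graph_cache:
        dims = [desc.bias_b, desc.bias_h, desc.bias_sq, desc.bias_skv]
        _graph_cache[desc] = f"graph(bias dims={dims})"
    return _graph_cache[desc]


full = ToyDescriptor(2, 12, 128, 128, 1, 1, 128, 128)   # 11ss bias
broadcast = ToyDescriptor(2, 12, 128, 128, 1, 1, 1, 128)  # 111s bias
print(get_or_build_graph(full))
print(get_or_build_graph(broadcast))
```

Because `bias_sq` and `bias_skv` take part in equality and hashing, the two descriptors map to different cache entries. If they were left out of the comparison, a graph built for one bias shape could be reused for the other.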

Flowchart

flowchart TD
    A[PyTorch Layer: DotProductAttention] -->|Pass bias tensor| B[Python utils.py]
    B -->|Extract bias_sq, bias_skv from tensor.shape| C[Call C++ API]
    C -->|fused_attn_arbitrary_seqlen_fwd/bwd| D[C++: fused_attn_f16_arbitrary_seqlen.cu]
    D -->|Store in FADescriptor_v1| E[utils.h: Cache key]
    D -->|Create cuDNN graph tensors| F{Bias shape check}
    F -->|111s shape?| G[Forward only: skip dBias tensor]
    F -->|Other shapes: 1hss, 11ss, b1ss, bhss| H[Forward + Backward: create dBias tensor]
    G -->|Use bias_sq, bias_skv dims| I[cuDNN FE Graph]
    H -->|Use bias_sq, bias_skv dims| I
    I -->|Execute| J[cuDNN Kernel]
    
    K[Context Parallel Flow] -->|Split sequence dims| L{Bias shape?}
    L -->|111s: sq=1| M[Split only s_kv dimension]
    L -->|Other shapes| N[Split both sq and s_kv dimensions]
    M --> O[CP Forward/Backward]
    N --> O
    O -->|Gather results| P[Compare with non-CP]

_{Last reviewed commit: ff174a8}


cyanguwa · 2025-12-22T18:57:12Z

Looks good - please pick the 111s test from my branch as well. Thanks!

greptile-apps

Greptile Overview

Greptile Summary

Fixes bias dimension handling in fused attention by plumbing actual bias tensor dimensions (bias_sq, bias_skv) from input tensors through to cuDNN, replacing the previous incorrect usage of query/key sequence lengths (s_q, s_kv). This resolves dimension mismatches for broadcasted bias shapes like [1,1,1,s] where the bias dimensions are smaller than the attention matrix dimensions. The fix enables gradient computation for non-1hss bias shapes by removing the backward pass restriction in the Python layer.

Confidence Score: 4/5

Safe to merge after addressing minor consistency concern in backward pass dimension extraction
The core fix correctly addresses the bias dimension bug by extracting actual tensor shapes instead of using sequence lengths. The implementation is consistent across forward pass, backward pass, and FP8 paths. Test coverage has been expanded to validate the fix. One minor style issue: backward pass extracts bias_b/bias_h from output_dBias but bias_sq/bias_skv from input_Bias, creating potential inconsistency if shapes don't match, though this is unlikely in practice.
transformer_engine/common/fused_attn/fused_attn_f16_arbitrary_seqlen.cu for dimension extraction consistency in backward pass

Important Files Changed

File Analysis

Filename	Score	Overview
transformer_engine/common/fused_attn/utils.h	5/5	Adds `bias_sq` and `bias_skv` fields to FADescriptor_v1 struct and updates comparison operator
transformer_engine/common/fused_attn/fused_attn_f16_arbitrary_seqlen.cu	4/5	Updates fwd/bwd implementations to extract and use actual bias dimensions from input tensors instead of query/key sequence lengths
transformer_engine/pytorch/attention/dot_product_attention/utils.py	4/5	Removes restriction preventing bias gradient computation for non-1hss bias shapes, enabling backward pass support

Sequence Diagram

sequenceDiagram
    participant Py as Python Layer
    participant TE as TE Common (CUDA)
    participant cuDNN as cuDNN Backend
    
    Note over Py,cuDNN: Bias Dimension Propagation Fix
    
    Py->>TE: Pass bias tensor [b, h, bias_sq, bias_skv]
    Note over TE: Extract actual bias dims<br/>bias_sq = input_Bias->shape[2]<br/>bias_skv = input_Bias->shape[3]
    
    TE->>TE: Store in FADescriptor_v1<br/>(bias_sq, bias_skv)
    
    alt Before Fix
        Note over TE: Used s_q, s_kv incorrectly<br/>(e.g., [1,1,128,128] for [1,1,1,128])
    end
    
    alt After Fix
        Note over TE: Uses bias_sq, bias_skv correctly<br/>(e.g., [1,1,1,128] for [1,1,1,128])
    end
    
    TE->>cuDNN: Create bias tensor with<br/>dim={bias_b, bias_h, bias_sq, bias_skv}
    TE->>cuDNN: Create dBias tensor with same dims
    
    cuDNN->>TE: Compute attention + gradients
    TE->>Py: Return output with correct bias gradients

greptile-apps · 2026-01-09T02:23:45Z

    bias_b = output_dBias->data.shape[0];
    bias_h = output_dBias->data.shape[1];
+    bias_sq = input_Bias->data.shape[2];
+    bias_skv = input_Bias->data.shape[3];


Bias dimensions are sourced from different tensors: bias_b and bias_h from output_dBias, while bias_sq and bias_skv from input_Bias. This assumes both tensors have matching shapes. Consider extracting all dimensions from the same tensor (preferably input_Bias for consistency with forward pass) or adding a validation check that shapes match.
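
A minimal sketch of the validation suggested here, written in Python for brevity (the code under review is C++). `bias_dims_for_backward` is a hypothetical helper that works on plain shape tuples rather than TE tensors.

```python
def bias_dims_for_backward(bias_shape, dbias_shape):
    """Check that bias and dbias shapes agree, then return the four bias dims.

    Hypothetical helper: all dimensions are read from a single shape only
    after confirming that the other tensor has the same shape.
    """
    bias_shape = tuple(bias_shape)
    dbias_shape = tuple(dbias_shape)
    if len(bias_shape) != 4:
        raise ValueError(f"expected a 4-D bias, got shape {bias_shape}")
    if bias_shape != dbias_shape:
        raise ValueError(
            f"bias shape {bias_shape} and dbias shape {dbias_shape} differ"
        )
    bias_b, bias_h, bias_sq, bias_skv = bias_shape
    return bias_b, bias_h, bias_sq, bias_skv


print(bias_dims_for_backward((1, 12, 128, 128), (1, 12, 128, 128)))
```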


Addressed in SHA 143ede5

greptile-apps

Greptile Overview

Greptile Summary

Fixes bias dimension plumbing from TransformerEngine to cuDNN by passing actual bias tensor dimensions (bias_sq, bias_skv) instead of sequence dimensions (s_q, s_kv). This resolves incorrect bias shape information being sent to cuDNN, particularly noticeable for bias shapes like [1,1,1,s] where the bias sequence dimensions differ from query/key/value sequence lengths. The fix enables cuDNN backend support for bias gradient computation in previously unsupported shapes.

Confidence Score: 5/5

Safe to merge - correct bug fix with comprehensive test coverage and no breaking changes
This PR correctly fixes the bias dimension plumbing issue where TE was incorrectly passing sequence dimensions instead of actual bias dimensions to cuDNN. The fix is well-implemented across all affected code paths (F16 and FP8), properly extracts bias dimensions from input tensors, and includes comprehensive test coverage. No functional issues or edge cases were identified.
No files require special attention

Important Files Changed

File Analysis

Filename	Score	Overview
transformer_engine/common/fused_attn/utils.h	5/5	Added bias_sq and bias_skv fields to FADescriptor_v1 struct and updated comparison operator
transformer_engine/common/fused_attn/fused_attn_f16_arbitrary_seqlen.cu	5/5	Updated forward and backward implementations to extract and pass correct bias dimensions from input tensors to cuDNN
transformer_engine/pytorch/attention/dot_product_attention/utils.py	5/5	Removed restriction that disabled FusedAttention for bias gradients in non-1hss shapes, enabling cuDNN backend for these cases

Sequence Diagram

sequenceDiagram
    participant PyTorch as PyTorch Layer
    participant Utils as utils.py
    participant F16Impl as fused_attn_f16<br/>arbitrary_seqlen.cu
    participant Descriptor as FADescriptor_v1
    participant cuDNN as cuDNN FE Graph

    Note over PyTorch,cuDNN: Forward Pass with Bias [1,1,1,s]
    
    PyTorch->>Utils: get_attention_backend()<br/>check bias support
    Utils->>Utils: Enable cuDNN for<br/>bias gradient
    PyTorch->>F16Impl: fused_attn_arbitrary_seqlen_fwd()<br/>with input_Bias tensor
    F16Impl->>F16Impl: Extract bias dimensions:<br/>bias_sq = input_Bias.shape[2]<br/>bias_skv = input_Bias.shape[3]
    F16Impl->>Descriptor: Create FADescriptor_v1<br/>with bias_sq, bias_skv
    F16Impl->>cuDNN: Create bias tensor with<br/>dim=[bias_b, bias_h, bias_sq, bias_skv]
    Note over cuDNN: Correct dims [1,1,1,s]<br/>instead of [1,1,s,s]
    cuDNN-->>F16Impl: Execute attention
    F16Impl-->>PyTorch: Return output

KshitijLakhani · 2026-01-09T02:30:29Z

/te-ci pytorch L0 L1

greptile-apps

Greptile Overview

Greptile Summary

This PR fixes a bug where TransformerEngine was incorrectly passing attention bias dimensions to cuDNN. Instead of using the actual bias tensor dimensions [bias_sq, bias_skv], it was using the full sequence dimensions [s_q, s_kv], which could be larger than the bias tensor.

Major Changes

Core Fix: Extract and pass actual bias dimensions (bias_sq, bias_skv) from the bias tensor shape throughout the call chain to cuDNN
Struct Update: Added bias_sq and bias_skv fields to FADescriptor_v1 for proper caching
Test Enhancement: Added bias gradient tracking and comparison in context parallelism tests
Backend Selection: Removed incorrect logic that disabled FusedAttention for non-1hss bias shapes when gradients weren't required

Issues Found

Critical Bug in Tests: run_attention_with_cp.py attempts to access bias.grad when bias is None (lines 342, 438), causing AttributeError for "no_bias" and "alibi" test cases

Confidence Score: 3/5

This PR fixes an important bug in bias dimension handling but introduces critical test failures
The core fix correctly addresses the bias dimension bug and is well-implemented across the C++/CUDA codebase. However, the test changes contain logic errors that will cause AttributeError when running tests with "no_bias" or "alibi" configurations, preventing proper validation of the fix.
Pay close attention to tests/pytorch/attention/run_attention_with_cp.py which has critical bugs on lines 342 and 438

Important Files Changed

File Analysis

Filename	Score	Overview
transformer_engine/common/fused_attn/utils.h	5/5	Added `bias_sq` and `bias_skv` fields to `FADescriptor_v1` struct and updated the comparison operator. Changes are straightforward and correctly implemented.
transformer_engine/common/fused_attn/fused_attn_f16_arbitrary_seqlen.cu	5/5	Correctly extracts `bias_sq` and `bias_skv` from `input_Bias->data.shape` and passes them through the call chain to cuDNN. Bias tensor dimensions and strides are properly updated to use actual bias dimensions instead of sequence lengths.
tests/pytorch/attention/run_attention_with_cp.py	2/5	Adds bias gradient tracking and comparison logic for context parallelism tests. Contains critical bugs where `bias.grad` and `bias_.grad` are accessed when bias is `None`, causing `AttributeError`. Also adds proper reshaping logic for dbias comparison.

Sequence Diagram

sequenceDiagram
    participant Python as Python Layer<br/>(utils.py)
    participant ArbitraryFwd as fused_attn_arbitrary_seqlen_fwd<br/>(C++ wrapper)
    participant ArbitraryFwdImpl as fused_attn_arbitrary_seqlen_fwd_impl<br/>(C++ implementation)
    participant cuDNN as cuDNN Graph
    
    Note over Python,cuDNN: Forward Pass with Bias [1, 1, 1, s_kv]
    
    Python->>ArbitraryFwd: input_Bias tensor with shape [b, h, sq, skv]
    ArbitraryFwd->>ArbitraryFwd: Extract bias_b = input_Bias->shape[0]<br/>bias_h = input_Bias->shape[1]<br/>bias_sq = input_Bias->shape[2]<br/>bias_skv = input_Bias->shape[3]
    ArbitraryFwd->>ArbitraryFwdImpl: Pass bias_b, bias_h, bias_sq, bias_skv
    ArbitraryFwdImpl->>ArbitraryFwdImpl: Store in FADescriptor_v1 for caching
    ArbitraryFwdImpl->>cuDNN: Create bias tensor with dimensions<br/>[bias_b, bias_h, bias_sq, bias_skv]<br/>Previously used [bias_b, bias_h, s_q, s_kv] ❌
    Note over cuDNN: Now receives correct bias dimensions ✓
    
    Note over Python,cuDNN: Backward Pass
    ArbitraryFwd->>ArbitraryFwd: Extract from output_dBias->shape
    ArbitraryFwd->>ArbitraryFwdImpl: Pass bias_sq, bias_skv
    ArbitraryFwdImpl->>cuDNN: Set dBias dimensions to [bias_b, bias_h, bias_sq, bias_skv]
    Note over cuDNN: dBias only computed if (bias_b==1 && bias_h==h)

greptile-apps · 2026-01-09T02:33:38Z

        else:
            out.backward(dout)
-    dq, dk, dv = q.grad, k.grad, v.grad
+    dq, dk, dv, dbias = q.grad, k.grad, v.grad, bias.grad


bias is None when attn_bias_type is "no_bias" or "alibi" (line 312), so bias.grad will raise AttributeError

Suggested change

-    dq, dk, dv, dbias = q.grad, k.grad, v.grad, bias.grad
+    dq, dk, dv, dbias = q.grad, k.grad, v.grad, bias.grad if bias is not None else None
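
The one-line guard works because a conditional expression binds more tightly than the commas of a tuple, so only the last element becomes conditional. A small sketch with a stand-in class instead of real tensors (`FakeTensor` and `collect_grads` are hypothetical names):

```python
class FakeTensor:
    """Stand-in for a tensor; only the .grad attribute is modeled."""

    def __init__(self, grad=None):
        self.grad = grad


def collect_grads(q, k, v, bias):
    # Parsed as (q.grad, k.grad, v.grad, (bias.grad if bias is not None else None)).
    return q.grad, k.grad, v.grad, bias.grad if bias is not None else None


q, k, v = FakeTensor("dq"), FakeTensor("dk"), FakeTensor("dv")
print(collect_grads(q, k, v, None))                 # ('dq', 'dk', 'dv', None)
print(collect_grads(q, k, v, FakeTensor("dbias")))  # ('dq', 'dk', 'dv', 'dbias')
```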

greptile-apps · 2026-01-09T02:33:39Z

        else:
            out_.backward(dout_)
-    dq_, dk_, dv_ = q_.grad, k_.grad, v_.grad
+    dq_, dk_, dv_, dbias_ = q_.grad, k_.grad, v_.grad, bias_.grad


bias_ is None when bias is None (line 355), so bias_.grad will raise AttributeError

Suggested change

-    dq_, dk_, dv_, dbias_ = q_.grad, k_.grad, v_.grad, bias_.grad
+    dq_, dk_, dv_, dbias_ = q_.grad, k_.grad, v_.grad, bias_.grad if bias_ is not None else None

cyanguwa · 2026-01-09T15:29:45Z

+                dbias.shape[2] // (2 * world_size),
+                dbias.shape[3],
+            )
+            # bias has fixed axis (2) as dbias shape: (1, 1, max_seqlen_q, max_seqlen_kv)


I think our CP implementation (after your C changes) should support all bias shapes, not just 111s. I also think your reshaping here should work for all shapes. Could you run the tests to confirm?

KshitijLakhani · 2026-01-21T19:42:57Z

/te-ci pytorch L0 L1

KshitijLakhani · 2026-01-22T00:48:13Z

/te-ci pytorch L0 L1

greptile-apps

_{7 files reviewed, 1 comment}


greptile-apps · 2026-02-06T18:34:05Z

Additional Comments (1)

transformer_engine/pytorch/attention/dot_product_attention/utils.py
Incorrect fused-attn gating

This block disables FusedAttention only for 111s when bias.requires_grad, but it no longer disables fused attention for other non-1hss bias shapes with requires_grad=True (e.g. 11ss, b1ss, bhss). However fused_attn_f16_arbitrary_seqlen.cu still skips wiring dBias for some shapes, so leaving fused attention enabled here can silently produce missing/incorrect bias.grad (or a null dBias binding) when users request dbias for those shapes.

Concrete fix: restore explicit gating that matches the C++ set_dbias(...) condition, i.e. disable fused attention whenever dbias won’t be computed for the given fu_core_attention_bias_shape but fu_core_attention_bias_requires_grad is true.

greptile-apps

_{4 files reviewed, 1 comment}


greptile-apps · 2026-02-06T18:47:06Z

+        # For 1hss, 11ss, b1ss, bhss
+        shape_cache = shape
        shape = shape.replace("_s_s", "_sq_skv")
+        if shape==shape_cache:


Missing spaces around == operator violates PEP 8 style

Suggested change

-        if shape==shape_cache:
+        if shape == shape_cache:


greptile-apps

_{3 files reviewed, 1 comment}


greptile-apps · 2026-02-06T19:05:00Z

+        if not fu_core_attention_bias_requires_grad:
            # max512 backend will only support [1, h, s, s]
            os.environ["NVTE_FUSED_ATTN_BACKEND"] = "1"


Independent condition - executed regardless of whether line 962's condition was true. This means NVTE_FUSED_ATTN_BACKEND=1 is set for all non-requires_grad bias cases, not just when use_fused_attention is still true. If line 964 set use_fused_attention=False, this environment variable shouldn't be set.

Consider changing to elif:

Suggested change

-        if not fu_core_attention_bias_requires_grad:
+        elif not fu_core_attention_bias_requires_grad:
             # max512 backend will only support [1, h, s, s]
             os.environ["NVTE_FUSED_ATTN_BACKEND"] = "1"

I think it should be "elif" here. Could you double check the logic here please?

You're correct - it should be elif. Here's the logic issue:

Current code flow:

    if fu_core_attention_bias_requires_grad and fu_core_attention_bias_shape == "111s":
        use_fused_attention = False  # Line 964 disables fused attention
    if not fu_core_attention_bias_requires_grad:  # Line 965 - always evaluated independently
        os.environ["NVTE_FUSED_ATTN_BACKEND"] = "1"

While the current code technically works (the conditions are mutually exclusive), using independent if statements obscures the intent. These are two branches handling the same scenario:

Branch 1: If bias requires grad AND shape is "111s" → disable fused attention

Branch 2: If bias doesn't require grad → use max512 backend

Using elif makes this relationship explicit and prevents future bugs if someone modifies the conditions.

Suggested fix:

Suggested change

-        if not fu_core_attention_bias_requires_grad:
-            # max512 backend will only support [1, h, s, s]
-            os.environ["NVTE_FUSED_ATTN_BACKEND"] = "1"
+        # dbias calculation is not supported for 111s as of cuDNN 9.18. So, use fused attention backend only if bias does not require grad.
+        if fu_core_attention_bias_requires_grad and fu_core_attention_bias_shape == "111s":
+            logger.warning("Disabling FusedAttention as dbias calculation is not supported for 111s")
+            use_fused_attention = False
+        elif not fu_core_attention_bias_requires_grad:

This ensures we only set the backend environment variable when we haven't already disabled fused attention, making the code's intent clearer.
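
The two branches can be sketched as a small pure function. This is an illustration only: it ignores the surrounding conditions in `utils.py`, returns the backend value instead of writing `os.environ`, and `gate_fused_attention` is a hypothetical name.

```python
def gate_fused_attention(bias_requires_grad, bias_shape, use_fused_attention=True):
    """Return (use_fused_attention, backend), where backend is the value that
    would be written to NVTE_FUSED_ATTN_BACKEND, or None if it is left alone.
    """
    backend = None
    if bias_requires_grad and bias_shape == "111s":
        # dbias for 111s is not supported, so fused attention is disabled.
        use_fused_attention = False
    elif not bias_requires_grad:
        # Forward-only bias: pick the backend used for this case in the thread.
        backend = "1"
    return use_fused_attention, backend


print(gate_fused_attention(True, "111s"))   # (False, None)
print(gate_fused_attention(False, "111s"))  # (True, '1')
print(gate_fused_attention(True, "bhss"))   # (True, None)
```

With `elif`, it is visible at a glance that the backend is never selected on the path that has just disabled fused attention, even if the first condition is later changed.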

Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>


greptile-apps

_{10 files reviewed, no comments}


KshitijLakhani · 2026-02-13T23:12:05Z

/te-ci L0 L1

cyanguwa · 2026-02-14T02:51:12Z

+                "and [1,1,sq,skv], [1,h,sq,skv], [b,1,sq,skv], [b,h,sq,skv] for backward!"
            )
+            # For all bias shapes except 111s, sq must be divisible by 2 and sk must be divisible by 2*cp_size
+            # For bias shape 111s, only sq must be divisible by 2


Should it be:
For bias shape 111s, only skv must be divisible by 2*cp_size

Thanks for catching this
Fixed it

cyanguwa · 2026-02-14T02:53:19Z

+                )
+            else:
+                assert attn_bias.shape[-1] % (2 * cp_size) == 0, "Sequence length does not meet divisible requirements!"
+                # [b, h, sq, sk] -> [b, h, sq, 2*cp, sk//(2*cp)]


Will you be missing a dimension, or will it be fine if no index_select is done?

If I understand right, your concern is if the consumer of attn_bias_ is aware of the missing dimensions or not ? Let me try answering this:

So, the only consumer of this attn_bias_ is in cp_p2p_fwd_fused_attn()

In cp_p2p_fwd_fused_attn() lines L844-860 take care of the concatenation based on 111s shape vs non-111s shape for bias. After the concatenation, the final shape created for attn_bias_inputs is [b, h, sq_chunk, sk_chunk], with sq_chunk being 1 for 111s, and, non-1 for non-111s shapes.
And I believe, this is fine as the broadcast for this is then handled downstream.
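
The two reshapes discussed in this thread can be followed with shapes alone. The sketch below is not the CP implementation: `cp_bias_view_shape` is a hypothetical helper, and detecting the 111s case via `sq == 1` is an assumption made for illustration. The target shapes are the ones given in the code comments quoted above.

```python
def cp_bias_view_shape(bias_shape, cp_size):
    """Return the view shape used to chunk a 4-D bias for context parallelism.

    111s:   [b, h, sq, sk] -> [b, h, sq, 2*cp, sk//(2*cp)]   (only sk is split)
    others: [b, h, sq, sk] -> [b, h, 2, sq//2, 2*cp, sk//(2*cp)]
    """
    b, h, sq, sk = bias_shape
    if sk % (2 * cp_size) != 0:
        raise ValueError("sk must be divisible by 2 * cp_size")
    if sq == 1:  # assumption: sq == 1 marks the 111s case
        return (b, h, sq, 2 * cp_size, sk // (2 * cp_size))
    if sq % 2 != 0:
        raise ValueError("sq must be divisible by 2")
    return (b, h, 2, sq // 2, 2 * cp_size, sk // (2 * cp_size))


print(cp_bias_view_shape((1, 1, 1, 4096), cp_size=2))       # (1, 1, 1, 4, 1024)
print(cp_bias_view_shape((2, 12, 4096, 4096), cp_size=2))   # (2, 12, 2, 2048, 4, 1024)
```

The 111s view has one dimension fewer because there is nothing to split along sq; the sq axis stays at 1 and is left for downstream broadcasting, as described in the reply above.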

Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>

greptile-apps

_{10 files reviewed, 2 comments}


greptile-apps · 2026-02-17T19:41:33Z

+                " [b,h,sq,skv] for backward!"
            )
+            # For all bias shapes except 111s, sq must be divisible by 2 and skv must be divisible by 2*cp_size
+            # For bias shape 111s, only skv must be divisible by 2


Inaccurate divisibility comment

The comment says only skv must be divisible by 2, but the actual assertion on line 1476 checks attn_bias.shape[-1] % (2 * cp_size) == 0, meaning skv must be divisible by 2 * cp_size, not just 2. For cp_size > 1 (e.g., 4), this is a significantly stricter requirement.

Suggested change

-            # For bias shape 111s, only skv must be divisible by 2
+            # For all bias shapes except 111s, sq must be divisible by 2 and skv must be divisible by 2*cp_size
+            # For bias shape 111s, only skv must be divisible by 2*cp_size

greptile-apps · 2026-02-17T19:41:34Z

            assert len(attn_bias.shape) == 4, (
-                "Only support bias shape of [b, h, sq, sk] for forward, "
-                "and [1, h, sq, sk] for backward!"
-            )
-            assert (
-                attn_bias.shape[-2] % 2 == 0 and attn_bias.shape[-1] % (2 * cp_size) == 0
-            ), "Sequence length does not meet divisible requirements!"
-            # [b, h, sq, sk] -> [b, h, 2, sq//2, 2*cp, sk//(2*cp)]
-            attn_bias_ = attn_bias.view(
-                *attn_bias.shape[:-2],
-                2,
-                attn_bias.shape[-2] // 2,
-                2 * cp_size,
-                attn_bias.shape[-1] // (2 * cp_size),
+                "Only support bias shape of [1,1,sq,skv], [1,h,sq,skv], [b,1,sq,skv], [b,h,sq,skv],"
+                " [1,1,sq,skv] for forward, and [1,1,sq,skv], [1,h,sq,skv], [b,1,sq,skv],"
+                " [b,h,sq,skv] for backward!"
            )


Duplicate and missing shape in assert error message

The error message lists [1,1,sq,skv] twice in the forward section and never mentions [1,1,1,skv] (the 111s shape that was just added). A developer hitting this assert won't know 111s is actually supported.

Suggested change

-            assert len(attn_bias.shape) == 4, (
-                "Only support bias shape of [1,1,sq,skv], [1,h,sq,skv], [b,1,sq,skv], [b,h,sq,skv],"
-                " [1,1,sq,skv] for forward, and [1,1,sq,skv], [1,h,sq,skv], [b,1,sq,skv],"
-                " [b,h,sq,skv] for backward!"
-            )
+            assert len(attn_bias.shape) == 4, (
+                "Only support bias shape of [1,1,1,skv], [1,1,sq,skv], [1,h,sq,skv], [b,1,sq,skv],"
+                " [b,h,sq,skv] for forward, and [1,1,sq,skv], [1,h,sq,skv], [b,1,sq,skv],"
+                " [b,h,sq,skv] for backward!"
+            )

Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>

greptile-apps

_{10 files reviewed, 3 comments}


greptile-apps · 2026-02-17T20:17:41Z

+                        ndim_bias = t.ndim
+                        seq_q_dim_bias = ndim_bias - 2  # Query sequence dimension
+                        # After reshaping both have shape: [..., 2, seq_q//2, seq_kv]
+                        # The split dimension is at seq_q_dim_bias
+                        slice_0 = [slice(None)] * ndim_bias
+                        slice_0[seq_q_dim_bias] = 0
+                        slice_1 = [slice(None)] * ndim_bias
+                        slice_1[seq_q_dim_bias] = 1


Wrong split dimension used in dbias comparison

After the reshape+index_select at lines 536–543, dbias has shape [..., 2, sq//(2*world_size), skv] (e.g. [1, H, 2, sq//4, skv] for 1hss). At that point t.ndim == 5, so seq_q_dim_bias = ndim_bias - 2 = 3.

But the dimension with exactly 2 elements (the CP split that should be compared chunk-by-chunk) is at index 2, not 3. Index 3 holds sq//(2*world_size) elements. The current code therefore slices along the inner sub-sequence dimension instead of the CP-half dimension, so errors confined to one CP half (dim 2 == 1) are never independently validated.

The fix is to use the same seq_q_dim value that was used during the earlier reshape (computed from the original ndim - 2 = 2):

Suggested change

-                        ndim_bias = t.ndim
-                        seq_q_dim_bias = ndim_bias - 2  # Query sequence dimension
-                        # After reshaping both have shape: [..., 2, seq_q//2, seq_kv]
-                        # The split dimension is at seq_q_dim_bias
-                        slice_0 = [slice(None)] * ndim_bias
-                        slice_0[seq_q_dim_bias] = 0
-                        slice_1 = [slice(None)] * ndim_bias
-                        slice_1[seq_q_dim_bias] = 1
+                    if names[i] == "dbias":
+                        # After reshaping both tensors have shape: [..., 2, seq_q//2, seq_kv]
+                        # The CP-split dimension is at index seq_q_dim (= original ndim - 2 = 2)
+                        split_dim = ndim - 2  # original ndim before reshape, i.e. 2 for [B,H,sq,skv]
+                        slice_0 = [slice(None)] * t.ndim
+                        slice_0[split_dim] = 0
+                        slice_1 = [slice(None)] * t.ndim
+                        slice_1[split_dim] = 1

Good catch by greptile
Was comparing the dim 3/-2 of the 5d tensor instead of dim 2/-3 of the tensor
Re ran the test and all pass locally
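
To see why the axis was off by one, it is enough to track shapes. The sketch assumes a 4-D dbias `[b, h, sq, skv]` that ends up as `[b, h, 2, sq // (2 * world_size), skv]` after the reshape and index_select described in the comment; `post_select_dbias_shape` is a hypothetical helper.

```python
def post_select_dbias_shape(shape, world_size):
    """Shape of dbias once sq has been chunked and this rank's two chunks kept."""
    b, h, sq, skv = shape
    return (b, h, 2, sq // (2 * world_size), skv)


orig_shape = (1, 12, 4096, 4096)
new_shape = post_select_dbias_shape(orig_shape, world_size=2)

split_dim = len(orig_shape) - 2  # 2: computed from the original 4-D shape
wrong_dim = len(new_shape) - 2   # 3: computed from the expanded 5-D shape

print(new_shape)             # (1, 12, 2, 1024, 4096)
print(new_shape[split_dim])  # 2    -> the two CP halves
print(new_shape[wrong_dim])  # 1024 -> positions inside one half
```

Indexing `wrong_dim` with 0 and 1 would compare only two single rows inside each half, which is why the comparison has to use the axis derived from the original number of dimensions.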

greptile-apps · 2026-02-17T20:17:42Z

+                        ndim_bias = t.ndim
+                        seq_q_dim_bias = ndim_bias - 2
+                        slice_0 = [slice(None)] * ndim_bias
+                        slice_0[seq_q_dim_bias] = 0
+                        slice_1 = [slice(None)] * ndim_bias
+                        slice_1[seq_q_dim_bias] = 1


Same wrong split dimension for sbhd format

This block is a copy of the bshd block above and has the same issue: seq_q_dim_bias = ndim_bias - 2 resolves to dimension 3 of the post-reshape tensor (which has sq//2 elements), while the CP-split dimension (2 elements) is at dimension 2.

The fix mirrors the one for bshd: capture the original ndim - 2 (== 2 for a 4-D bias) and use it as the split axis, rather than recomputing from the already-expanded t.ndim.

Suggested change

-                        ndim_bias = t.ndim
-                        seq_q_dim_bias = ndim_bias - 2
-                        slice_0 = [slice(None)] * ndim_bias
-                        slice_0[seq_q_dim_bias] = 0
-                        slice_1 = [slice(None)] * ndim_bias
-                        slice_1[seq_q_dim_bias] = 1
+                    if names[i] == "dbias":
+                        split_dim = ndim - 2  # original ndim before reshape, i.e. 2 for [B,H,sq,skv]
+                        slice_0 = [slice(None)] * t.ndim
+                        slice_0[split_dim] = 0
+                        slice_1 = [slice(None)] * t.ndim
+                        slice_1[split_dim] = 1

Good catch by greptile
Was comparing the dim 3/-2 of the 5d tensor instead of dim 2/-3 of the tensor
Re ran the test and all pass locally

greptile-apps · 2026-02-17T20:17:43Z

+        dBias = mha_graph->tensor(
+            fe::graph::Tensor_attributes()
+                .set_name("dBias")
+                .set_dim({bias_b, bias_h, bias_sq, bias_skv})
+                .set_stride({bias_h * bias_sq * bias_skv, bias_sq * bias_skv, bias_skv, 1}));
        sdpa_backward_options.set_bias(bias);
-        // shapes [1, 1, s, s], [b, 1, s, s], [b, h, s, s]
-        // are not supported for dbias calculation but they are
-        // supported for forward bias calculation
-        if ((bias_b == 1) && (bias_h == h)) {
+        // bias shapes [1, 1, s, s], [b, 1, s, s], [b, h, s, s], [1, h, s, s] are supported for dbias calculation
+        // bias shape [1, 1, 1, s] is not supported for dbias calculation as of cuDNN 9.18
+        if (!((bias_b == 1) && (bias_h == 1) && (bias_sq == 1))) {
          sdpa_backward_options.set_dbias(dBias);
        }
      }


Unused dBias graph tensor for 111s shape

When bias_sq == 1 (the 111s case), dBias is created as a graph tensor and added to the cache tuple, but set_dbias(dBias) is never called and variant_pack[dBias] = nullptr is set at execution time. This leaves a disconnected tensor node in the cuDNN FE graph.

In practice the code path where 111s reaches the backward pass is blocked by is_training = False in all callers, so there is no runtime risk today. However, if that guard is ever relaxed the graph build / validation could fail. Consider skipping the dBias tensor creation for the unsupported case:

if (is_bias) {
  bias = mha_graph->tensor(/* ... */);
  sdpa_backward_options.set_bias(bias);
  // [1,1,1,s] is not supported for dbias as of cuDNN 9.18
  if (!((bias_b == 1) && (bias_h == 1) && (bias_sq == 1))) {
    dBias = mha_graph->tensor(/* same dims */);
    sdpa_backward_options.set_dbias(dBias);
  }
}

This would also simplify the variant_pack assignment below.

This is unrelated to the PR itself, but creating the tensor only when it is needed is a good change.
Added a change for this.
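
For reference, the dims, strides, and guard from the diff above can be sketched in Python. This is only an illustration of the arithmetic: `packed_strides` and `dbias_supported` are hypothetical names, and the sizes are made up.

```python
def packed_strides(dims):
    """Row-major (contiguous) strides for the given dims."""
    strides = [1] * len(dims)
    for i in range(len(dims) - 2, -1, -1):
        strides[i] = strides[i + 1] * dims[i + 1]
    return strides


def dbias_supported(bias_b, bias_h, bias_sq):
    """Mirror of the guard in the diff: only [1, 1, 1, s] is excluded."""
    return not (bias_b == 1 and bias_h == 1 and bias_sq == 1)


b, h, sq, skv = 2, 12, 128, 128  # illustrative sizes only
bias_shapes = {
    "1hss": (1, h, sq, skv),
    "11ss": (1, 1, sq, skv),
    "b1ss": (b, 1, sq, skv),
    "bhss": (b, h, sq, skv),
    "111s": (1, 1, 1, skv),
}

for name, dims in bias_shapes.items():
    print(name, dims, packed_strides(dims), dbias_supported(*dims[:3]))
```

For a 4-D bias, `packed_strides` yields the same values as the `set_stride` call in the diff, i.e. `{bias_h * bias_sq * bias_skv, bias_sq * bias_skv, bias_skv, 1}`.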

Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>

greptile-apps

10 files reviewed, 1 comment


greptile-apps · 2026-02-18T02:09:24Z

+    "cp_1_4": ModelConfig(
+        2, 4096, 12, 128, attn_bias_type="post_scale_bias", bias_shape="bhss"
+    ),  # MHA
+    "cp_1_5": ModelConfig(2, 4096, 12, 128, attn_mask_type="causal", window_size=(512, 0)),  # MHA


SWA test coverage reduced

The old cp_1_4 (now cp_1_5) had window_size=(512, 512) which tested bidirectional sliding window attention. This was changed to window_size=(512, 0) (left-only SWA). The same applies to the old cp_2_4 (now cp_2_6) at line 190. Was this intentional? If so, there's no longer any fused attention CP test covering window_size_right > 0.

This is a good catch by greptile as well.
It looks like the window size was altered during testing and never reinstated :(
Fixed in ff174a8.
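
To make the coverage gap concrete, here is a minimal sketch of a sliding-window predicate. It assumes `window_size=(left, right)` means that query `i` may attend to keys `j` with `i - left <= j <= i + right`, that sq equals skv, and that both window values are non-negative; the helper names are hypothetical and no other mask is applied.

```python
def swa_allowed(i, j, window_size):
    """True if query i may attend to key j under a (left, right) window."""
    left, right = window_size
    return i - left <= j <= i + right


def keys_right_of_query(seq_len, window_size):
    """Count (query, key) pairs with key > query that the window admits."""
    return sum(
        1
        for i in range(seq_len)
        for j in range(i + 1, seq_len)
        if swa_allowed(i, j, window_size)
    )


print(keys_right_of_query(8, (2, 0)))  # left-only window
print(keys_right_of_query(8, (2, 2)))  # bidirectional window
```

Under these assumptions a `(left, 0)` window never admits a key to the right of the query, so a test suite using only such windows does not exercise `window_size_right > 0`.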

Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>

greptile-apps

10 files reviewed, no comments


KshitijLakhani · 2026-02-18T02:51:02Z

/te-ci L0 L1

…rt for additional bias shapes (#2537)

* Plumbing correct bias dims from TE to cudnn. Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci
* Make changes for cp bias code. Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>
* Add dbias and dbias_ to run_dpa_with_cp test. Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci
* Fix: Use output_dBias instead of input_dBias to extract the shape. Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>
* Add guards for bias/bias_/dbias/dbias_ being None. Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>
* Add support for bias shape 111s in addition to the original 1hss, 11ss, b1ss and bhss. Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>
* Add support for dbias calculation and variant packing for the dbias shapes b1ss, bhss, 11ss in addition to the already supported 1hss. Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>
* Add support for 111s bias shape in DPA. Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>
* Allow fused attn for dbias calculation for 11ss, b1ss, bhss. Disable fused attn if dbias calculation for 111s is required, else enable. Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>
* Disable requires_grad for bias for shape 111s in tests. Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>
* Disable bias grad / training flag for 111s bias in the non-CP attn tests. Add bias shape 111s to test_dpa_bias_shapes. Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>
* Fix to correctly create the bias shape tensor instead of the hard coded shape. Fix the comparison logic shapes for bias/dbias. Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>
* Add fused attn cp test cases for all supported bias shapes. Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci
* nit: switch to elif for bias grad conditional. Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>
* Add CP support for bias/dbias shape 111s. Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>
* Add support for is_training in CP attn tests. Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci
* nit: Fix incorrect comment. Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>
* nit: Fix incorrect comment and assert string. Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>
* Create the dbias graph tensor only if it is a cuDNN supported bias shape. Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>
* Fix the dim that is being compared for the two cp chunks in the test. Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>
* nit: Reinstate the original test for right side swa. Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>

---------
Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

…rt for additional bias shapes (NVIDIA#2537): same squashed commit message as the preceding entry, with one additional trailer, Signed-off-by: Oleg Goncharov <ogoncharov@nvidia.com>

KshitijLakhani self-assigned this Dec 20, 2025

KshitijLakhani added the attention label Dec 20, 2025

KshitijLakhani force-pushed the klakhani/fix/bias-shape branch from 200fd98 to 8da3252 Compare December 22, 2025 18:21

KshitijLakhani marked this pull request as ready for review December 22, 2025 18:24

KshitijLakhani changed the title ~~Plumbing correct bias dims from TE to cudnn~~ [PyT] Plumbing correct bias dims from TE to cudnn Dec 22, 2025

KshitijLakhani added the bug and pytorch labels Dec 22, 2025

KshitijLakhani requested a review from cyanguwa December 22, 2025 18:43

greptile-apps Bot reviewed Jan 9, 2026


cyanguwa reviewed Jan 9, 2026


KshitijLakhani force-pushed the klakhani/fix/bias-shape branch from 11c7107 to de3011e Compare January 21, 2026 19:41

KshitijLakhani added 2.12.0 and removed 2.12.0 labels Jan 22, 2026

KshitijLakhani changed the title ~~[PyT] Plumbing correct bias dims from TE to cudnn~~ [PyT] Plumbing correct bias dims from TE to cudnn, while adding support for additional bias shapes Feb 6, 2026

KshitijLakhani force-pushed the klakhani/fix/bias-shape branch 2 times, most recently from ab1d2a9 to 8147617 Compare February 6, 2026 17:57

greptile-apps Bot reviewed Feb 6, 2026


Plumbing correct bias dims from TE to cudnn

ab542fc

Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>

KshitijLakhani and others added 4 commits February 13, 2026 13:44

nit: switch to elif for bias grad conditional

f795056

Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>

Add CP support for bias/dbias shape 111s

0e74dcf

Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>

Add support for is_training in CP attn tests

0acf8f8

Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>

[pre-commit.ci] auto fixes from pre-commit.com hooks

2133bd8

for more information, see https://pre-commit.ci

greptile-apps Bot reviewed Feb 13, 2026


KshitijLakhani requested a review from cyanguwa February 13, 2026 23:12

KshitijLakhani removed the bug label Feb 13, 2026

cyanguwa reviewed Feb 14, 2026


nit: Fix incorrect comment

89e90a5

Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>

KshitijLakhani requested a review from cyanguwa February 17, 2026 19:32

cyanguwa previously approved these changes Feb 17, 2026


greptile-apps Bot reviewed Feb 17, 2026


nit: Fix incorrect comment and assert string

5a25d9c

Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>

KshitijLakhani dismissed cyanguwa’s stale review via 5a25d9c February 17, 2026 20:08

KshitijLakhani requested a review from cyanguwa February 17, 2026 20:16

greptile-apps Bot reviewed Feb 17, 2026


KshitijLakhani added 2 commits February 17, 2026 17:00

Create the dbias graph tensor only if it is a cuDNN supported bias shape

0e2a72f

Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>

Fix the dim that is being compared for the two cp chunks in the test

f066c88

Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>

greptile-apps Bot reviewed Feb 18, 2026


nit: Reinstate the original test for right side swa

ff174a8

Signed-off-by: Kshitij Lakhani <klakhani@nvidia.com>

greptile-apps Bot reviewed Feb 18, 2026


cyanguwa approved these changes Feb 18, 2026


KshitijLakhani merged commit 2d0d276 into NVIDIA:main Feb 18, 2026
48 of 54 checks passed

KshitijLakhani deleted the klakhani/fix/bias-shape branch February 18, 2026 23:47

hungryGeek16 mentioned this pull request May 31, 2026

fix unfused padding causal sdpa #3063

Open

	dq, dk, dv, dbias = q.grad, k.grad, v.grad, bias.grad
	dq, dk, dv, dbias = q.grad, k.grad, v.grad, bias.grad if bias is not None else None

	dq_, dk_, dv_, dbias_ = q_.grad, k_.grad, v_.grad, bias_.grad
	dq_, dk_, dv_, dbias_ = q_.grad, k_.grad, v_.grad, bias_.grad if bias_ is not None else None
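
The guarded form above works because a conditional expression binds more tightly than the commas of the tuple, so only the last element is conditional. A small sketch with stand-in objects (`SimpleNamespace` replaces real tensors here, and `collect_grads` is a hypothetical helper):

```python
from types import SimpleNamespace


def collect_grads(q, bias):
    # Parsed as: (q.grad, (bias.grad if bias is not None else None))
    dq, dbias = q.grad, bias.grad if bias is not None else None
    return dq, dbias


q = SimpleNamespace(grad="dq")
bias = SimpleNamespace(grad="dbias")

print(collect_grads(q, bias))
print(collect_grads(q, None))
```

Without the guard, `bias.grad` would raise `AttributeError` whenever the configuration has no bias tensor.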

-        if not fu_core_attention_bias_requires_grad:
-            # max512 backend will only support [1, h, s, s]
-            os.environ["NVTE_FUSED_ATTN_BACKEND"] = "1"
+        # dbias calculation is not supported for 111s as of cuDNN 9.18. So, use fused attention backend only if bias does not require grad.
+        if fu_core_attention_bias_requires_grad and fu_core_attention_bias_shape == "111s":
+            logger.warning("Disabling FusedAttention as dbias calculation is not supported for 111s")
+            use_fused_attention = False
+        elif not fu_core_attention_bias_requires_grad:
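
The first branch of that conditional can be read as a small predicate. The sketch below covers only that branch, since the body of the `elif` is not shown in the excerpt; `fused_attention_allowed` is a hypothetical name, not a TransformerEngine function.

```python
def fused_attention_allowed(bias_shape, bias_requires_grad):
    """Return False only when dbias would be needed for a 111s bias."""
    if bias_requires_grad and bias_shape == "111s":
        # dbias for [1, 1, 1, s] is not supported as of cuDNN 9.18
        return False
    return True


for shape in ("1hss", "11ss", "b1ss", "bhss", "111s"):
    for requires_grad in (False, True):
        print(shape, requires_grad, fused_attention_allowed(shape, requires_grad))
```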

	# For bias shape 111s, only skv must be divisible by 2
	# For all bias shapes except 111s, sq must be divisible by 2 and skv must be divisible by 2*cp_size
	# For bias shape 111s, only skv must be divisible by 2*cp_size
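
The divisibility rules in the last two comments can be written as a single check. This is a sketch of those two comments only; the function name and arguments are hypothetical.

```python
def cp_bias_dims_ok(bias_shape, sq, skv, cp_size):
    """Check the sequence-length constraints stated for CP with bias."""
    skv_ok = skv % (2 * cp_size) == 0
    if bias_shape == "111s":
        # Only the skv axis exists to be split for a [1, 1, 1, s] bias.
        return skv_ok
    return sq % 2 == 0 and skv_ok


print(cp_bias_dims_ok("bhss", sq=4096, skv=4096, cp_size=2))
print(cp_bias_dims_ok("bhss", sq=4095, skv=4096, cp_size=2))
print(cp_bias_dims_ok("111s", sq=1, skv=4096, cp_size=2))
print(cp_bias_dims_ok("111s", sq=1, skv=4098, cp_size=2))
```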
