Skip to content

Conversation

@Fan-Yunfan
Copy link
Contributor

@Fan-Yunfan Fan-Yunfan commented Nov 1, 2025

Problem

cpp/include/tensorrt_llm/runtime/iTensor.h

  1. In the getDimension() method, the static_assert used for compile-time checking utilizes the static constant attribute MAX_DIMS of the runtime object "shape", which may cause confusion for readers: why can "shape", clearly a runtime object, be used at compile time?
    using Shape = nvinfer1::Dims;
    ......
    [[nodiscard]] virtual Shape const& getShape() const = 0;
    ......
    template <SizeType32 n>
    [[nodiscard]] DimType64 getDimension() const
    {
        auto const shape = getShape();
        static_assert(n < shape.MAX_DIMS && n >= -shape.MAX_DIMS,
            "Trying to access the dimension of a tensor, when its maximal shape cannot have that dimension.");
        if constexpr (n < 0)
        {
            return shape.d[shape.nbDims + n];
        }
        else
        {
            return shape.d[n];
        }
    }

TensorRT/include/NvInferRuntimeBase.h

//!
//! Alias for Dims64.
//!
using Dims = Dims64;
class Dims64
{
public:
    //! The maximum rank (number of dimensions) supported for a tensor.
    static constexpr int32_t MAX_DIMS{8};

    //! The rank (number of dimensions).
    int32_t nbDims;

    //! The extent of each dimension.
    int64_t d[MAX_DIMS];
};
  1. In the volumeNonNegative() method, the data type of "vol" is std::int64_t, but it is converted to the std::size_t type when returned. On 32-bit platforms, converting std::int64_t to std::size_t may pose an overflow risk.
static std::int64_t volume(Shape const& dims)
{
    {
        return dims.nbDims < 0 ? -1
            : dims.nbDims == 0
            ? 0
            : std::accumulate(dims.d, dims.d + dims.nbDims, std::int64_t{1}, std::multiplies<>{});
    }
}
......
static std::size_t volumeNonNegative(Shape const& shape)
{
    auto const vol = volume(shape);
    TLLM_CHECK_WITH_INFO(0 <= vol, "Invalid tensor shape");
    return static_cast<std::size_t>(vol);
}

Solution

  1. Replace static member access through object instances (which appears to be runtime but is optimized to compile-time by the compiler) with class type scope access (compile-time)
template <SizeType32 n>
[[nodiscard]] DimType64 getDimension() const
{
    auto const shape = getShape();
    static_assert(n < Shape::MAX_DIMS && n >= -Shape::MAX_DIMS,
        "Trying to access the dimension of a tensor, when its maximal shape cannot have that dimension.");
    ......
}
  1. Add boundary checks for 32-bit platforms
if constexpr (sizeof(std::size_t) == 4)
{
    TLLM_CHECK_WITH_INFO(vol <= static_cast<std::int64_t>(std::numeric_limits<std::size_t>::max(),
                             "Tensor volume exceeds 32-bit size_t maximum capacity."));
}

Summary by CodeRabbit

  • Bug Fixes
    • Improved handling of edge cases in tensor dimension calculations to prevent overflow on 32-bit platforms.
    • Enhanced error detection for out-of-range values to ensure more robust operation.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 1, 2025

📝 Walkthrough

Walkthrough

Updates ITensor interface in a single header file: fixes static assertion reference to Shape::MAX_DIMS and adds 32-bit size_t overflow validation in the volumeNonNegative method.

Changes

Cohort / File(s) Summary
ITensor interface updates
cpp/include/tensorrt_llm/runtime/iTensor.h
Fixed static_assert reference in getDimension method from shape.MAX_DIMS to Shape::MAX_DIMS; added 32-bit size_t overflow check in volumeNonNegative method to validate volume fits within std::size_t.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

  • Verify static_assert reference correction is syntactically valid and semantically correct
  • Confirm overflow check logic is appropriate for 32-bit platform compatibility

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Description Check ⚠️ Warning The pull request description provides detailed problem and solution information, but it significantly deviates from the required template structure. The author provided "Problem" and "Solution" sections with technical details, but failed to include the required "Description" section, completely omitted the "Test Coverage" section (which should explain how the changes are tested), and did not include the PR Checklist confirmation. While the technical content is substantive and on-topic, the missing critical template sections—particularly test coverage information and the required checklist confirmation—represent a major gap in following the repository's contribution guidelines. The author should restructure the description to follow the template: add a clear "Description" section summarizing the issue and solution, add a "Test Coverage" section documenting what tests safeguard these changes (especially tests for the 32-bit overflow check and the static_assert change), and finally review and confirm the PR Checklist items by checking the final confirmation box. This will ensure the PR meets the repository's documentation standards and provides reviewers with complete information about test coverage.
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (1 passed)
Check name Status Explanation
Title Check ✅ Passed The pull request title "[None][fix] Enhancing code robustness and adding boundary checks for ITensor" follows the required template format with the ticket identifier [None] and type [fix], and it clearly summarizes both main changes in the pull request: the static_assert update for clarity and the addition of 32-bit overflow boundary checks. The title is concise, specific, and avoids vague phrasing, accurately reflecting the changes to the ITensor header file.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cpp/include/tensorrt_llm/runtime/iTensor.h (1)

2-2: Update copyright year to include 2025.

The file is being modified in 2025, so the copyright year range should be updated.

Apply this diff:

- * Copyright (c) 2022-2024, NVIDIA CORPORATION.  All rights reserved.
+ * Copyright (c) 2022-2025, NVIDIA CORPORATION.  All rights reserved.

As per coding guidelines.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d798d66 and fbc88cc.

📒 Files selected for processing (1)
  • cpp/include/tensorrt_llm/runtime/iTensor.h (2 hunks)
🧰 Additional context used
📓 Path-based instructions (7)
**/*.{h,hpp,hh,hxx,cpp,cxx,cc,cu,cuh}

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

**/*.{h,hpp,hh,hxx,cpp,cxx,cc,cu,cuh}: Namespace closing braces must include a trailing comment with the namespace name (e.g., '} // namespace foo').
Prefer const or constexpr variables over #define for constants.
Declare variables that are not modified after initialization as const.
Avoid magic literals in code; except for 0, nullptr, true, false. Use named constants for comparisons and logic.
Use Allman brace style for formatting.
Place the semicolon of an empty for/while loop on a new line.
Bodies of switch/while/do-while/for must be compound statements (brace-delimited), and if/else must always be followed by brace-delimited statements.
Type names (e.g., classes) must be CamelCase starting with an uppercase letter (e.g., FooBar).
Local variables, methods, and namespaces use lowerCamelCase (e.g., localFooBar).
Non-magic-number global variables that are non-static and not in an anonymous namespace must be lowerCamelCase prefixed with 'g' (e.g., gDontUseGlobalFoos).
Non-magic-number globals that are static or in an anonymous namespace use lowerCamelCase prefixed with 's' (e.g., sMutableStaticGlobal).
Locally visible static variables use lowerCamelCase with 's' prefix (e.g., static std::once_flag sFlag).
Private/protected member variables use 'm' prefix with CamelCase (e.g., mNbFooValues). Public members may omit, but 'm' is encouraged for clarity.
Constants (enums, global constants, static constants, and function-scope magic/literal constants) use uppercase SNAKE_CASE with 'k' prefix (e.g., kDIGIT_NUM).
Function-scope constants that are not magic numbers or literals are named like non-constant variables (e.g., bool const pass = a && b).
If macros are necessary, name them in UPPER_SNAKE_CASE (e.g., FOO_VERSION) and prefer constants over #define.
Use LLVM clang-format; wrap lines at a maximum of 120 columns; use '// clang-format off/on' sparingly with justification.
Use smart pointers for heap allocations; prefer unique_ptr for sole ownership, shared_ptr for shared...

Files:

  • cpp/include/tensorrt_llm/runtime/iTensor.h
**/*.{cpp,cxx,cc,cu,h,hpp,hh,hxx,cuh}

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

C++ filenames should be lowerCamelCase (first letter lowercase) and must be case-insensitive unique within a compilation target.

Files:

  • cpp/include/tensorrt_llm/runtime/iTensor.h
**/*.{h,hpp,hh,hxx,cpp,cxx,cc,cu,cuh,py}

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Use only spaces, no tabs; indent with 4 spaces.

Files:

  • cpp/include/tensorrt_llm/runtime/iTensor.h
**/*.{h,hpp,hh,hxx}

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Document new class interfaces and function prototypes with Doxygen; use //! for single-line and //!< for members.

Files:

  • cpp/include/tensorrt_llm/runtime/iTensor.h
**/*.{h,hpp,hh,hxx,cpp,cxx,cc}

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

**/*.{h,hpp,hh,hxx,cpp,cxx,cc}: Prefer anonymous namespaces over 'static' for internal linkage of functions.
All templates (class/function/member/static) must be instantiated at least once; non-POD classes should have private data members.

Files:

  • cpp/include/tensorrt_llm/runtime/iTensor.h
**/*.{h,hpp,hh,hxx,cuh}

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Use include guards named 'TRTLLM_<FILE_NAME_IN_CAPS_WITH_UNDERSCORES>_H' (no leading or trailing underscore; directory names excluded).

Files:

  • cpp/include/tensorrt_llm/runtime/iTensor.h
**/*.{cpp,cxx,cc,h,hpp,hh,hxx,cu,cuh,py}

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Prepend the NVIDIA Apache-2.0 copyright header with current year to the top of all source files (e.g., .cpp, .h, .cu, .py).

Files:

  • cpp/include/tensorrt_llm/runtime/iTensor.h
🧠 Learnings (3)
📓 Common learnings
Learnt from: jhaotingc
Repo: NVIDIA/TensorRT-LLM PR: 7856
File: cpp/tensorrt_llm/thop/fp8BlockScaleMoe.cpp:159-166
Timestamp: 2025-09-19T21:28:13.751Z
Learning: In TensorRT-LLM blockScaleMoe routing (cpp/tensorrt_llm/kernels/trtllmGenKernels/blockScaleMoe/runner.cu), the DeepSeek routing method performs reinterpret_cast<float*>(routingLogits) at line 89, which could cause issues if routing_logits are BF16. However, Qwen3-FP8 models use RenormalizeNaive routing method and are not affected by this dtype casting issue.
📚 Learning: 2025-09-23T15:13:48.819Z
Learnt from: nv-lschneider
Repo: NVIDIA/TensorRT-LLM PR: 7910
File: cpp/tensorrt_llm/kernels/nccl_device/multimem.h:20-30
Timestamp: 2025-09-23T15:13:48.819Z
Learning: TRT-LLM targets modern CUDA toolkits that support FP8 datatypes, so cuda_fp8.h can be included unconditionally without version guards in TRT-LLM code.

Applied to files:

  • cpp/include/tensorrt_llm/runtime/iTensor.h
📚 Learning: 2025-08-20T06:56:02.889Z
Learnt from: eopXD
Repo: NVIDIA/TensorRT-LLM PR: 6768
File: cpp/tensorrt_llm/batch_manager/kvCacheManager.cpp:577-579
Timestamp: 2025-08-20T06:56:02.889Z
Learning: In cpp/tensorrt_llm/batch_manager/kvCacheManager.cpp, maxSequenceLength is now enforced as a non-optional argument in the BlockManager constructor, so concerns about std::nullopt defaulting to 0 are not applicable. When windowSize > maxSequenceLength, a warning should be added instead of handling optional parameter cases.

Applied to files:

  • cpp/include/tensorrt_llm/runtime/iTensor.h
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Pre-commit Check
🔇 Additional comments (1)
cpp/include/tensorrt_llm/runtime/iTensor.h (1)

74-74: LGTM! Improved clarity for static constant access.

Accessing MAX_DIMS via Shape::MAX_DIMS instead of shape.MAX_DIMS makes it explicit that this is a compile-time constant accessed at class scope, not through a runtime instance.

@svc-trtllm-gh-bot svc-trtllm-gh-bot added the Community want to contribute PRs initiated from Community label Nov 1, 2025
@Fan-Yunfan Fan-Yunfan force-pushed the fyf_improve_itensor_robustness branch from fbc88cc to d559b88 Compare November 1, 2025 08:31
@Fan-Yunfan
Copy link
Contributor Author

Fan-Yunfan commented Nov 1, 2025

Dear @karljang, I've made some code quality improvements for ITensor. If the review looks good, we can squash the commits from this PR and #8140 into a single commit to reduce the validation pressure on our CI pipeline.
9a3430c5-d2d0-4316-9128-47c2cc007efc

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Community want to contribute PRs initiated from Community

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants