-
Notifications
You must be signed in to change notification settings - Fork 1.9k
[None][fix] Fix and enhance MemoryCounters Singleton with compile-time safety and bounds checking #8140
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…rsion Signed-off-by: fanyunfan <[email protected]>
…ds checking Signed-off-by: fanyunfan <[email protected]>
📝 WalkthroughWalkthroughAdded overflow checks in MemoryCounters::allocate and ::deallocate that throw when size exceeds DiffType limits. Replaced a runtime error for unknown MemoryType with a compile-time static_assert, removing the runtime fallback path. No public API signatures changed. Changes are confined to cpp/include/tensorrt_llm/runtime/memoryCounters.h. Changes
Sequence Diagram(s)sequenceDiagram
participant Caller
participant MemoryCounters
Caller->>MemoryCounters: allocate(size)
alt size exceeds DiffType max
MemoryCounters-->>Caller: throw overflow_error
else valid size
MemoryCounters-->>Caller: update counters
end
Caller->>MemoryCounters: deallocate(size)
alt size exceeds DiffType max
MemoryCounters-->>Caller: throw overflow_error
else valid size
MemoryCounters-->>Caller: update counters
end
note over MemoryCounters: Unknown MemoryType<T> now fails at compile time via static_assert
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Dear @karljang , Would you like to help me review this pr when you have time? |
|
@Fan-Yunfan , |
…r MemoryCounters Singleton. Signed-off-by: fanyunfan <[email protected]>
b3b1f2f to
4511c59
Compare
Thanks for your correction!Using I have updated the commits. Thanks for let me learn about the concept and usage of std::false_type of C++.
|
|
/bot run |
|
PR_Github #21433 [ run ] triggered by Bot |
|
PR_Github #21433 [ run ] completed with state |
|
|
|
/bot run |
|
Just running it again, the errors look not related to this change. |
|
PR_Github #21718 [ run ] triggered by Bot. Commit: |
|
PR_Github #21718 [ run ] completed with state |
|
/bot run |
Signed-off-by: fanyunfan <[email protected]>
41c55be to
a676106
Compare
|
Oops, this slipped my mind, I'm rerunning the tests now |
|
/bot run |
|
PR_Github #23425 [ run ] triggered by Bot. Commit: |
|
PR_Github #23425 [ run ] completed with state |
karljang
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM;
Haha, no worries~ I just dropped by when it crossed my mind. It doesn’t matter whether it’s early or late—just feel free to take a look whenever you have a moment. If you’re busy, just focus on your work first. I don’t have any specific requests~ |
MartinMarciniszyn
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Fan-Yunfan , thank you for your suggestions. I agree with the static_assert, but I am not convinced about the other changes. Please revert these.
Since you are editing this file, I suggest renaming SizeType32 to SizeType since the 32 is wrong and misleading. Many thanks for your help.
Thank you for your review—these were very helpful suggestions! I have already made the corresponding revisions based on your advice. Additionally, if the systems in focus are all 64-bit systems, I was wondering whether the 32-bit system check in another PR related to ITensor at #8855 might also be unnecessary? (I believe so, but it might require your confirmation~) |
…constraints Signed-off-by: fanyunfan <[email protected]>
1509d03 to
8c43e55
Compare



Problem
The
allocateanddeallocatetemplate functions in compile-time can determine the specified value of T, so the exception process macroTLLM_THROWwill never be invoke in runtime, it should be replaced with compile-time check such asstatic_assert.The
MemoryTypeString<T>template class don't have specified impl for unsupported type of T, so it don't havevaluemember. It may throw error: 'value' is not a member of 'MemoryTypeString<T>' .Lines
auto const sizeDiff = static_cast<DiffType>(size);andauto const sizeDiff = -static_cast<DiffType>(size);can overflow becauseSizeType32is an alias forstd::size_twhileDiffTypeisstd::ptrdiff_t.On a 32-bit platform, for example,
std::size_tspans [0 … 4 294 967 295] (2³²–1) butstd::ptrdiff_tonly covers [–2 147 483 648 … 2 147 483 647] (–2³¹ … 2³¹–1).Any size value larger than
PTRDIFF_MAXwill therefore be truncated, yielding an incorrect signed result.The current
MemoryCounterssingleton does not explicitly forbid copy and assignment operations, which is unsafe.Current Implementation
cpp/include/tensorrt_llm/runtime/memoryCounters.h
cpp/include/tensorrt_llm/runtime/iBuffer.h
Solution
static_assertto replaceTLLM_THROWand removeMemoryTypeString<T>::value.static_cast<DiffType>(size).MemoryCountersSingleton.