feat: preserve debug name hints in ptoas emitc#658
Conversation
There was a problem hiding this comment.
Code Review
This pull request implements variable name preservation from PTO IR to the generated C++ code to improve debuggability. It introduces a design document, test cases, and logic in ptoas to extract name hints from Location metadata, annotate the IR, and post-process the C++ output to replace generic variable names with hinted ones. Feedback includes addressing potential name collisions by populating the used names set, optimizing keyword lookups using llvm::StringSwitch, improving the performance of marker stripping from O(N^2) to O(N), and replacing magic numbers with named constants.
| static void rewriteNameHintMarkers(std::string &cpp) { | ||
| constexpr llvm::StringLiteral kMarkerPrefix = "/* PTOAS_NAME_HINTS:"; | ||
| llvm::StringMap<std::string> replacements; | ||
| std::set<std::string> usedNames; |
There was a problem hiding this comment.
The usedNames set is currently empty when renaming starts. This means that if a name hint conflicts with an existing identifier in the C++ code (such as a function parameter, a global variable, or a local variable that didn't have a hint), makeUniqueCppIdentifier will not detect the collision. You should populate usedNames with all existing identifiers in the cpp string before starting the renaming process.
| static bool isReservedCppIdentifier(llvm::StringRef name) { | ||
| static const std::set<std::string> kReserved = { | ||
| "alignas", "alignof", "asm", "auto", "bool", | ||
| "break", "case", "catch", "char", "char8_t", | ||
| "char16_t", "char32_t", "class", "const", "consteval", | ||
| "constexpr", "constinit", "const_cast","continue", "co_await", | ||
| "co_return", "co_yield", "decltype", "default", "delete", | ||
| "do", "double", "dynamic_cast", "else", "enum", | ||
| "explicit", "export", "extern", "false", "float", | ||
| "for", "friend", "goto", "if", "inline", | ||
| "int", "long", "mutable", "namespace", "new", | ||
| "noexcept", "nullptr", "operator", "private", "protected", | ||
| "public", "register", "reinterpret_cast", "requires", | ||
| "return", "short", "signed", "sizeof", "static", | ||
| "static_assert", "static_cast", "struct", "switch", "template", | ||
| "this", "thread_local", "throw", "true", "try", | ||
| "typedef", "typeid", "typename", "union", "unsigned", | ||
| "using", "virtual", "void", "volatile", "wchar_t", | ||
| "while"}; | ||
| return kReserved.count(name.str()) != 0; | ||
| } |
There was a problem hiding this comment.
Using std::set<std::string> for keyword lookup is functional but less efficient than using llvm::StringSwitch. In LLVM-based tools, StringSwitch is more idiomatic and provides better performance for small sets of strings.
static bool isReservedCppIdentifier(llvm::StringRef name) {
return llvm::StringSwitch<bool>(name)
.Cases("alignas", "alignof", "asm", "auto", "bool", true)
.Cases("break", "case", "catch", "char", "char8_t", true)
.Cases("char16_t", "char32_t", "class", "const", "consteval", true)
.Cases("constexpr", "constinit", "const_cast", "continue", "co_await", true)
.Cases("co_return", "co_yield", "decltype", "default", "delete", true)
.Cases("do", "double", "dynamic_cast", "else", "enum", true)
.Cases("explicit", "export", "extern", "false", "float", true)
.Cases("for", "friend", "goto", "if", "inline", true)
.Cases("int", "long", "mutable", "namespace", "new", true)
.Cases("noexcept", "nullptr", "operator", "private", "protected", true)
.Cases("public", "register", "reinterpret_cast", "requires", true)
.Cases("return", "short", "signed", "sizeof", "static", true)
.Cases("static_assert", "static_cast", "struct", "switch", "template", true)
.Cases("this", "thread_local", "throw", "true", "try", true)
.Cases("typedef", "typeid", "typename", "union", "unsigned", true)
.Cases("using", "virtual", "void", "volatile", "wchar_t", true)
.Case("while", true)
.Default(false);
}| static void stripNameHintMarkers(std::string &cpp) { | ||
| constexpr llvm::StringLiteral kMarkerPrefix = "/* PTOAS_NAME_HINTS:"; | ||
| size_t searchPos = 0; | ||
| while (true) { | ||
| size_t markerPos = cpp.find(kMarkerPrefix.str(), searchPos); | ||
| if (markerPos == std::string::npos) | ||
| break; | ||
|
|
||
| size_t markerEnd = cpp.find("*/", markerPos + kMarkerPrefix.size()); | ||
| if (markerEnd == std::string::npos) | ||
| break; | ||
| markerEnd += 2; | ||
| while (markerEnd < cpp.size() && | ||
| (cpp[markerEnd] == '\r' || cpp[markerEnd] == '\n')) | ||
| ++markerEnd; | ||
|
|
||
| cpp.erase(markerPos, markerEnd - markerPos); | ||
| searchPos = markerPos; | ||
| } | ||
| } |
There was a problem hiding this comment.
The current implementation of stripNameHintMarkers has std::string::erase shifts all subsequent characters on every call. For large generated C++ files, this can become a performance bottleneck. It is better to perform the stripping in a single
| if (!hints) | ||
| continue; | ||
|
|
||
| size_t windowEnd = std::min(searchPos + static_cast<size_t>(2048), |
There was a problem hiding this comment.
The search window size 2048 is a magic number. While likely sufficient for most cases, it should be defined as a named constant (e.g., kSearchWindowSize) to improve maintainability and clarify the intent.
| size_t windowEnd = std::min(searchPos + static_cast<size_t>(2048), | |
| constexpr size_t kSearchWindowSize = 2048; | |
| size_t windowEnd = std::min(searchPos + kSearchWindowSize, cpp.size()); |
Codex Review该评论由 review 机器人自动更新。
Summary检查到 2 个 P2 问题:多结果 textual SSA 名称不会被保留,且 Qwen validation 的 scalar 默认值兼容层不完整。 Findings
这个 PR 现在会把 textual 函数参数名保留下来,所以 Qwen case 的参数名会从旧的 |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ca1265fd6c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (size_t commentPos = line.find("//"); commentPos != llvm::StringRef::npos) | ||
| body = line.take_front(commentPos); |
There was a problem hiding this comment.
Parse MLIR comments without truncating string literals
stripMlirLineComments removes everything after the first // on each line, even when // appears inside a quoted MLIR string (e.g. location metadata like loc("a//b")). Because extractTextualNameHints depends on this preprocessed text for function/SSA hint alignment, valid textual IR containing // in string literals can produce misaligned or missing hints and incorrect downstream renaming. This is a regression in the new textual-hint pipeline and should use a comment scanner that ignores // within string literals.
Useful? React with 👍 / 👎.
| if (trimmed.starts_with("using ")) | ||
| break; |
There was a problem hiding this comment.
Continue scanning declarations after top-level using aliases
findTopLevelGeneratedDeclarations stops at the first using ... line, so functions that begin with a type alias never collect later hoisted generated declarations. collectPendingIdentifierRenames relies on this list to map CFG/block-arg hints back onto hoisted vN temporaries, so in any function with both a leading using and block-arg hints, the block-arg rename path is silently disabled. This breaks the stated name-preservation behavior for a common EmitC function prologue pattern.
Useful? React with 👍 / 👎.
|
/run a3 |
|
已接收
页面会自动刷新,可以直接看当前阶段、排队情况和最近结果。 |
A3 板测失败
失败用例
|
A3 板测失败详情:PR #658qwen3_decode_incore_4
qwen3_decode_incore_3
qwen3_decode_incore_8
qwen3_decode_incore_0
qwen3_decode_incore_1
qwen3_decode_incore_10
qwen3_decode_incore_14
qwen3_decode_incore_11
qwen3_decode_incore_9
qwen3_decode_incore_6
qwen3_decode_incore_2
qwen3_decode_incore_13
qwen3_decode_incore_16
qwen3_decode_incore_7
qwen3_decode_incore_5
qwen3_decode_incore_15
qwen3_decode_incore_12
|
|
/run a3 |
|
已接收
页面会自动刷新,可以直接看当前阶段、排队情况和最近结果。 |
A3 板测完成(有跳过)
|
实现 issue #337 的调试命名提示链路:
本地验证: