fix(ascend):update README by lywbarca · Pull Request #673 · UbiquitousLearning/mllm

lywbarca · 2026-05-06T04:39:17Z

Please check Guidelines for Contributing.

Summary by CodeRabbit

Release Notes

New Features
- Added support for Ascend NPU backend with Qwen3 W8A8 quantized inference.
Improvements
- Enhanced UTF-8 safe streaming output for more reliable text rendering.

coderabbitai · 2026-05-06T04:39:30Z

📝 Walkthrough

Walkthrough

This PR introduces UTF-8 safe streaming output for Qwen Ascend inference by adding prefill warmup, buffered token decoding, and UTF-8 validation logic. The tokenizer initializes with UTF-8 locale, and output is accumulated and printed only when valid UTF-8 sequences are complete. Documentation is updated to announce the feature.

Changes

UTF-8 Safe Streaming and Tokenizer Configuration

Layer / File(s)	Summary
Locale and Tokenizer Setup `mllm/models/qwen_ascend/tokenization_qwen_ascend.hpp`	QwenAscendTokenizer constructor now explicitly initializes the local environment with UTF-8 locale via `preprocessor::initLocal("C.UTF-8")`.
UTF-8 Validation Utilities `examples/qwen_ascend/main.cpp` (lines 6–38)	Anonymous namespace function `takeValidUtf8Prefix` extracts valid UTF-8 prefixes from partial byte sequences; include set expanded to support UTF-8 and string handling.
Buffered Generation and Output `examples/qwen_ascend/main.cpp` (lines 223–255)	QA generation path adds prefill warmup step, decodes tokens into `pending_text` buffer, validates and prints only complete UTF-8 chunks after each generation step, then ensures final newline.
Documentation `README.md`	Latest News announces Ascend NPU backend support with Qwen3 W8A8 inference; Qwen3-0.6B table entry for W8A8 converted to hyperlink pointing to model card.

Sequence Diagram

sequenceDiagram
    participant Gen as Token Generator
    participant Buf as UTF-8 Buffer
    participant Val as UTF-8 Validator
    participant Out as Output

    loop Generation Loop
        Gen->>Buf: Decode token → pending_text
        Buf->>Val: Check for valid UTF-8 prefix
        Val->>Buf: Return valid prefix + remainder
        Buf->>Out: Print valid UTF-8 chunk
        Buf->>Buf: Keep remainder for next iteration
    end
    Out->>Out: Print final newline

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

oreomaker
xumengwei
liang1232018
yirongjie

Poem

🐰 Streams of tokens, UTF-8 flow,
Buffered wisdom, safe and slow,
Qwen meets Ascend with locale care,
Valid prefixes float through air!
✨ hops

🚥 Pre-merge checks | ✅ 2 | ❌ 3

❌ Failed checks (3 warnings)

Check name	Status	Explanation	Resolution
Title check	⚠️ Warning	The title only mentions README update, but the changes significantly include UTF-8 streaming improvements in main.cpp and tokenization changes, making it incomplete and misleading.	Update title to reflect all major changes, e.g., 'fix(ascend): add UTF-8 safe streaming and update README for Qwen3 W8A8 support'
Description check	⚠️ Warning	The PR description contains only the template boilerplate with no actual description of changes, missing all details about the modifications made.	Add a meaningful description explaining the UTF-8 streaming improvements, prefill warmup step, locale initialization changes, and README updates with relevant context.
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

Generate code and open pull requests
Plan features and break down work
Investigate incidents and troubleshoot customer tickets together
Automate recurring tasks and respond to alerts with triggers
Summarize progress and report instantly

Built for teams:

Shared memory across your entire org—no repeating context
Per-thread sandboxes to safely plan and execute work
Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@examples/qwen_ascend/main.cpp`:
- Around line 19-35: The helper takeValidUtf8Prefix currently returns empty when
an invalid byte is at the buffer start and never consumes it; update
takeValidUtf8Prefix to detect irrecoverable leading bytes (e.g., a start octet
that would require >4 continuation bytes or otherwise cannot form a valid UTF‑8
sequence) and consume/discard one invalid byte so progress continues, while
still returning empty when nothing valid precedes it; also ensure the final
flush path uses takeValidUtf8Prefix on pending_text (instead of printing
pending_text directly) so only validated UTF‑8 is emitted. Reference: function
takeValidUtf8Prefix and the final tail-flush that prints pending_text.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1e278d7c-fbd2-4317-b44b-90eb3e3d5fd7

📥 Commits

Reviewing files that changed from the base of the PR and between 729ca4c and 3973c6a.

📒 Files selected for processing (3)

README.md
examples/qwen_ascend/main.cpp
mllm/models/qwen_ascend/tokenization_qwen_ascend.hpp

coderabbitai · 2026-05-06T04:44:51Z

+std::string takeValidUtf8Prefix(std::string& pending_text) {
+  auto invalid = utf8::find_invalid(pending_text.begin(), pending_text.end());
+  if (invalid == pending_text.begin()) {
+    return {};
+  }
+
+  if (invalid == pending_text.end()) {
+    std::string ready_text;
+    ready_text.swap(pending_text);
+    return ready_text;
+  }
+
+  auto ready_bytes = static_cast<size_t>(std::distance(pending_text.begin(), invalid));
+  auto ready_text = pending_text.substr(0, ready_bytes);
+  pending_text.erase(0, ready_bytes);
+  return ready_text;
+}


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

In utf8cpp, what does utf8::find_invalid(begin, end) return for (1) an incomplete multi-byte sequence at buffer start and (2) an irrecoverably invalid leading byte?

💡 Result:

For utf8::find_invalid(begin, end), the function returns an iterator pointing to the first invalid octet; if none are found it returns end. It does this by scanning from start and calling utf8::internal::validate_next(result, end) and immediately returning result when the returned error code is not UTF8_OK. [1] (1) Incomplete multi-byte sequence at buffer start: It returns begin (i.e., the iterator to the first byte of the incomplete sequence). validate_next is invoked starting from result==start, and any non-UTF8_OK (including INCOMPLETE_SEQUENCE) causes find_invalid to return the current iterator (the start of that invalid sequence). [1][2] (2) Irrecoverably invalid leading byte: It returns an iterator pointing to that invalid leading byte (the first byte whose validate_next call fails with a non-UTF8_OK error such as INVALID_LEAD). Since find_invalid returns the iterator value at which validate_next fails, it is the position of the bad leading octet. [1][2]

Citations:

1: https://github.com/nemtrif/utfcpp/blob/master/source/utf8/core.h

2: https://github.com/ledger/utfcpp/blob/master/source/utf8/checked.h

🏁 Script executed:

# Check if the file exists and get its size wc -l examples/qwen_ascend/main.cpp

Repository: UbiquitousLearning/mllm

Length of output: 101

🏁 Script executed:

# View the takeValidUtf8Prefix helper (lines 19-35) sed -n '19,35p' examples/qwen_ascend/main.cpp

Repository: UbiquitousLearning/mllm

Length of output: 604

🏁 Script executed:

# View the tail flush logic (lines 250-260 for context around 252-254) sed -n '250,260p' examples/qwen_ascend/main.cpp

Repository: UbiquitousLearning/mllm

Length of output: 451

🏁 Script executed:

# Check how pending_text is used in the loop - search for the streaming context sed -n '240,270p' examples/qwen_ascend/main.cpp

Repository: UbiquitousLearning/mllm

Length of output: 981

Handle irrecoverable leading bytes to avoid stalled output and invalid tail flush.

When utf8::find_invalid detects an invalid byte at the buffer start, the helper returns empty without consuming it. An irrecoverably invalid leading byte (e.g., a malformed UTF-8 start octet) will therefore block progress indefinitely—the next iteration finds the same byte, returns empty again, and loops without advancing. Additionally, the final flush at line 252-254 directly prints pending_text without validation, allowing invalid UTF-8 to reach output even though the streaming loop uses the validated takeValidUtf8Prefix.

Discard irrecoverable leading bytes that exceed the maximum UTF-8 sequence length (4 bytes), and apply takeValidUtf8Prefix to the tail flush to ensure output is valid UTF-8.

Proposed fix

std::string takeValidUtf8Prefix(std::string& pending_text) { + if (pending_text.empty()) { + return {}; + } auto invalid = utf8::find_invalid(pending_text.begin(), pending_text.end()); if (invalid == pending_text.begin()) { - return {}; + // Keep short prefixes that may become valid with future bytes. + // UTF-8 max code point width is 4 bytes; longer invalid-at-begin likely means malformed lead. + if (pending_text.size() <= 4) { + return {}; + } + pending_text.erase(0, 1); + return {}; } @@ - if (!pending_text.empty()) { - fmt::print("{}", pending_text); - } + auto tail = takeValidUtf8Prefix(pending_text); + if (!tail.empty()) { + fmt::print("{}", tail); + }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@examples/qwen_ascend/main.cpp` around lines 19 - 35, The helper takeValidUtf8Prefix currently returns empty when an invalid byte is at the buffer start and never consumes it; update takeValidUtf8Prefix to detect irrecoverable leading bytes (e.g., a start octet that would require >4 continuation bytes or otherwise cannot form a valid UTF‑8 sequence) and consume/discard one invalid byte so progress continues, while still returning empty when nothing valid precedes it; also ensure the final flush path uses takeValidUtf8Prefix on pending_text (instead of printing pending_text directly) so only validated UTF‑8 is emitted. Reference: function takeValidUtf8Prefix and the final tail-flush that prints pending_text.

fix(ascend):update README

3973c6a

coderabbitai Bot reviewed May 6, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(ascend):update README#673

fix(ascend):update README#673
lywbarca wants to merge 1 commit intoUbiquitousLearning:mainfrom
lywbarca:ascend

lywbarca commented May 6, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 6, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Suggested reviewers

Poem

❌ Failed checks (3 warnings)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

lywbarca commented May 6, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai Bot commented May 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Suggested reviewers

Poem

❌ Failed checks (3 warnings)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 6, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

lywbarca commented May 6, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 6, 2026 •

edited

Loading