[doc,model] feat: Add Qwen3-235B NPU Long Sequence Optimizing Practice #5835
Vvictorrrr wants to merge 18 commits into verl-project:main from
Conversation
…g Sequence Reinforcement Learning.md update optimizing doc
Code Review
This pull request adds a comprehensive tutorial for optimizing the training performance of the Qwen3-235B model on Ascend NPU platforms, focusing on long-sequence reinforcement learning. The document covers performance bottleneck analysis and provides specific optimization strategies for inference and training. The review feedback identifies several minor issues that require correction, including typos in the 'torch-npu' component name, inconsistent section numbering, and a stray character in a configuration snippet.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
…in Asynchronous Training Scenarios
@@ -0,0 +1,171 @@

## 1. Background Overview

As the post-training paradigm for large models evolves from SFT toward SFT-RL-SFT, reinforcement learning plays a key role in aligning large models and improving their capabilities. The verl framework on the Ascend NPU platform has become one of the mainstream training tools, and long-sequence inference scenarios in particular place higher demands on performance and memory efficiency.

@@ -0,0 +1,154 @@
…in Asynchronous Training Scenarios to NPU Performance Optimization Practices of Qwen3-30B-A3B Model
…to NPU Performance Optimization Practices of Qwen3-30B-A3B Model.md
…g Sequence Reinforcement Learning.md
@@ -0,0 +1,171 @@
Document updated: 2025.11
Please follow this format instead, e.g. "Last updated: 03/26/2026".
## Background Overview
As large models continue to grow in scale, performance bottlenecks in inference and training become increasingly prominent. Under MoE architectures in particular, communication overhead, operator efficiency, and memory management become the key factors limiting system throughput.
The "Last updated: 03/26/2026" format is missing here as well; please follow it too.
…g Sequence Reinforcement Learning.md
## Version Environment

- vLLM-Ascend: v0.11.0
The vllm_ascend version currently used in verl is 0.13.0, but this line says vLLM-Ascend: v0.11.0.
| torch | 2.7.1 |
| torch-npu | 2.7.1-0919 |

Companion versions for the MindSpeed-RL 2.2.0 commercial release:
Why does MindSpeed-RL 2.2.0 appear here?
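As an aside, version pins like the ones in the table above can be sanity-checked with plain string parsing. A minimal sketch (a hypothetical helper, not part of verl or its CI), assuming the version strings quoted in the environment table:

```python
def parse_version(v: str) -> tuple:
    # "v0.11.0" or "2.7.1-0919" -> (0, 11, 0) / (2, 7, 1);
    # a leading "v" and any "-NNNN" build suffix are ignored
    core = v.lstrip("v").split("-")[0]
    return tuple(int(x) for x in core.split("."))

# Pins taken from the environment table above
pins = {"torch": "2.7.1", "torch-npu": "2.7.1-0919", "vllm-ascend": "v0.11.0"}

# torch-npu should match torch's major.minor
assert parse_version(pins["torch-npu"])[:2] == parse_version(pins["torch"])[:2]
# the reviewer notes verl currently pins vllm_ascend 0.13.0, newer than the doc's 0.11.0
assert parse_version(pins["vllm-ascend"]) < (0, 13, 0)
```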
What does this PR do?
Checklist Before Starting
Title format: [{modules}] {type}: {description} (this will be checked by the CI)
- {modules} include fsdp, megatron, veomni, sglang, vllm, rollout, trainer, ci, training_utils, recipe, hardware, deployment, ray, worker, single_controller, misc, perf, model, algo, env, tool, ckpt, doc, data, cfg, reward, fully_async, one_step_off; multiple modules are written like [megatron, fsdp, doc]
- {type} is in feat, fix, refactor, chore, test
- If this PR is breaking, prepend [BREAKING] to the beginning of the title, e.g. [BREAKING][fsdp, megatron] feat: dynamic batching

Test
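The title convention above is mechanical enough to check with a regular expression. A minimal sketch (a hypothetical checker, not the actual CI implementation), assuming only the module and type lists quoted in the checklist:

```python
import re

MODULES = {
    "fsdp", "megatron", "veomni", "sglang", "vllm", "rollout", "trainer",
    "ci", "training_utils", "recipe", "hardware", "deployment", "ray",
    "worker", "single_controller", "misc", "perf", "model", "algo", "env",
    "tool", "ckpt", "doc", "data", "cfg", "reward", "fully_async",
    "one_step_off",
}
TYPES = {"feat", "fix", "refactor", "chore", "test"}

# Optional [BREAKING] prefix, then [mod1, mod2, ...] type: description
TITLE_RE = re.compile(r"^(?:\[BREAKING\])?\[([^\]]+)\]\s*(\w+):\s*(.+)$")

def title_ok(title: str) -> bool:
    m = TITLE_RE.match(title)
    if not m:
        return False
    mods = [s.strip() for s in m.group(1).split(",")]
    return all(mod in MODULES for mod in mods) and m.group(2) in TYPES

assert title_ok("[doc,model] feat: Add Qwen3-235B NPU Long Sequence Optimizing Practice")
assert title_ok("[BREAKING][fsdp, megatron] feat: dynamic batching")
assert not title_ok("doc: missing module list")
```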
API and Usage Example
# Add code snippet or script demonstrating how to use this

Design & Code Changes
Checklist Before Submitting
Important
Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.
- Run pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always
- Request CI through the ci-request channel in the verl Slack workspace. (If not accessible, please try the Feishu group.)
- If this PR modifies the recipe submodule, please also update the reference to the submodule commit via git submodule update --remote or cd recipe && git pull origin main.