[doc,model] feat: Add Qwen3-235B NPU Long Sequence Optimizing Practice #5835
Vvictorrrr wants to merge 18 commits into verl-project:main from
Conversation
…g Sequence Reinforcement Learning.md update optimizing doc
Code Review
This pull request adds a comprehensive tutorial for optimizing the training performance of the Qwen3-235B model on Ascend NPU platforms, focusing on long-sequence reinforcement learning. The document covers performance bottleneck analysis and provides specific optimization strategies for inference and training. The review feedback identifies several minor issues that require correction, including typos in the 'torch-npu' component name, inconsistent section numbering, and a stray character in a configuration snippet.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
…in Asynchronous Training Scenarios
@@ -0,0 +1,171 @@

## 1. Background Overview

As the post-training paradigm for large models evolves from SFT toward SFT-RL-SFT, reinforcement learning plays a key role in aligning large models and improving their capabilities. The verl framework on the Ascend NPU platform has become one of the mainstream training tools, and long-sequence inference scenarios in particular place higher demands on performance and memory efficiency.

@@ -0,0 +1,154 @@
…in Asynchronous Training Scenarios to NPU Performance Optimization Practices of Qwen3-30B-A3B Model
…to NPU Performance Optimization Practices of Qwen3-30B-A3B Model.md
…g Sequence Reinforcement Learning.md
@@ -0,0 +1,171 @@
Document updated: 2025.11
Please follow this format instead, e.g. "Last updated: 03/26/2026".
## Background Overview
As large models continue to grow in scale, performance bottlenecks in inference and training become increasingly prominent. Under MoE architectures in particular, communication overhead, operator efficiency, and memory management become the key factors limiting system throughput.
The "Last updated: 03/26/2026" format is missing here as well; please follow it too.
…g Sequence Reinforcement Learning.md
## Version Environment

- vLLM-Ascend: v0.11.0
The vllm_ascend version currently used in verl is 0.13.0, but this line says vLLM-Ascend: v0.11.0.
| torch | 2.7.1 |
| torch-npu | 2.7.1-0919 |

Companion versions for the MindSpeed-RL 2.2.0 commercial release:
Why does MindSpeed-RL 2.2.0 appear here?
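As an aside, version pins like the ones in the table above can be sanity-checked with plain string parsing. A minimal sketch (a hypothetical helper, not part of verl or its CI), assuming the version strings quoted in the environment table:

```python
def parse_version(v: str) -> tuple:
    # "v0.11.0" or "2.7.1-0919" -> (0, 11, 0) / (2, 7, 1);
    # a leading "v" and any "-NNNN" build suffix are ignored
    core = v.lstrip("v").split("-")[0]
    return tuple(int(x) for x in core.split("."))

# Pins taken from the environment table above
pins = {"torch": "2.7.1", "torch-npu": "2.7.1-0919", "vllm-ascend": "v0.11.0"}

# torch-npu should match torch's major.minor
assert parse_version(pins["torch-npu"])[:2] == parse_version(pins["torch"])[:2]
# the reviewer notes verl currently pins vllm_ascend 0.13.0, newer than the doc's 0.11.0
assert parse_version(pins["vllm-ascend"]) < (0, 13, 0)
```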
What does this PR do?
Checklist Before Starting
Title format: [{modules}] {type}: {description} (this will be checked by the CI)
- {modules} include fsdp, megatron, veomni, sglang, vllm, rollout, trainer, ci, training_utils, recipe, hardware, deployment, ray, worker, single_controller, misc, perf, model, algo, env, tool, ckpt, doc, data, cfg, reward, fully_async, one_step_off; multiple modules are written like [megatron, fsdp, doc]
- {type} is in feat, fix, refactor, chore, test
- If this PR is breaking, prepend [BREAKING] to the beginning of the title, e.g. [BREAKING][fsdp, megatron] feat: dynamic batching

Test
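The title convention above is mechanical enough to check with a regular expression. A minimal sketch (a hypothetical checker, not the actual CI implementation), assuming only the module and type lists quoted in the checklist:

```python
import re

MODULES = {
    "fsdp", "megatron", "veomni", "sglang", "vllm", "rollout", "trainer",
    "ci", "training_utils", "recipe", "hardware", "deployment", "ray",
    "worker", "single_controller", "misc", "perf", "model", "algo", "env",
    "tool", "ckpt", "doc", "data", "cfg", "reward", "fully_async",
    "one_step_off",
}
TYPES = {"feat", "fix", "refactor", "chore", "test"}

# Optional [BREAKING] prefix, then [mod1, mod2, ...] type: description
TITLE_RE = re.compile(r"^(?:\[BREAKING\])?\[([^\]]+)\]\s*(\w+):\s*(.+)$")

def title_ok(title: str) -> bool:
    m = TITLE_RE.match(title)
    if not m:
        return False
    mods = [s.strip() for s in m.group(1).split(",")]
    return all(mod in MODULES for mod in mods) and m.group(2) in TYPES

assert title_ok("[doc,model] feat: Add Qwen3-235B NPU Long Sequence Optimizing Practice")
assert title_ok("[BREAKING][fsdp, megatron] feat: dynamic batching")
assert not title_ok("doc: missing module list")
```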
API and Usage Example
# Add code snippet or script demonstrating how to use this

Design & Code Changes
Checklist Before Submitting
Important
Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.
- Run pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always
- Request CI through the ci-request channel in the verl Slack workspace. (If not accessible, please try the Feishu group.)
- If this PR modifies the recipe submodule, please also update the reference to the submodule commit via git submodule update --remote or cd recipe && git pull origin main.