Changes from all commits (65 commits)
92c454a [reward] fix: reward model args and reward_kwargs bug (#5289) (yyDing1, Feb 12, 2026)
8bcf908 [doc] chore: gspo update config and add version with npu (#5279) (chengminhua, Feb 12, 2026)
9cb89d8 [fsdp,veomni] fix: remove FSDPUlyssesShardingManager to make eval_mod… (wuxibin89, Feb 12, 2026)
a43eecb [veomni] refactor: Modify dp related parameters to align with FSDP ba… (ChengQianqian, Feb 13, 2026)
4dd4980 [trtllm] feat: use max utilization scheduler by default (#5302) (tongyuantongyu, Feb 13, 2026)
9682330 [worker, tool] fix: stabilize agent loop extra fields schema (#5301) (denismegerle, Feb 13, 2026)
9917f76 [algo] feat: add NPU SAPO training script for Qwen3-8B (FSDP/vLLM bac… (Vvictorrrr, Feb 13, 2026)
a22d51e [fsdp, vllm] feat: add NPU GRPO training scripts for Qwen3-VL-8B (FSD… (zhihaofang1017, Feb 13, 2026)
2703d73 [fsdp, vllm] feat: add NPU GRPO training scripts for Qwen3-VL-30B (FS… (alwaysyiyu, Feb 13, 2026)
536a978 [model,cfg] fix: type annotation for Lora target_modules (#5223) (thvasilo, Feb 13, 2026)
ec123e6 [megatron] feat: Support LoRA training with FP16 using Megatron-Bridg… (xichengpro, Feb 13, 2026)
0c5cc48 [ci] fix: main pre-commit (#5318) (pengwu22, Feb 14, 2026)
395938b [misc] refactor: delete remaining batch-mode code in single controlle… (ji-huazhong, Feb 14, 2026)
cf97337 [rollout] fix: make skip rollout compatible with async mode (#5320) (ChengQianqian, Feb 14, 2026)
8c41724 [veomni, trainer] fix: padding pixel value with padding_scale for vl … (A1waysBeenHere, Feb 14, 2026)
27f65f1 [fsdp,algo] feat: add NVFP4 QAT (Quantization-Aware Training) support… (zhangyimi, Feb 14, 2026)
e8a484c [docs] Add new awesome work using Verl (#5328) (MING-ZCH, Feb 16, 2026)
4c9e3f7 [vllm] feat: remove workers from vLLMHttpServer (#5330) (tongyx361, Feb 16, 2026)
ef26847 Revert "[vllm] feat: remove workers from vLLMHttpServer" (#5333) (PeterSH6, Feb 16, 2026)
54d41ca [misc] refactor: remove deprecated codes (#5336) (ji-huazhong, Feb 18, 2026)
52d8ba9 [misc] fix: include config files for experimental entrypoints in pack… (guillemgt, Feb 18, 2026)
28550a7 [ci] chore: set torch-npu to 2.7.1.post2 in ascend dockerfile (#5345) (ji-huazhong, Feb 18, 2026)
45ae86e Revert "[ci] chore: set torch-npu to 2.7.1.post2 in ascend dockerfile… (ji-huazhong, Feb 19, 2026)
eec88a0 [reward] fix: empty class_dict for standalone reward model resource p… (yyDing1, Feb 19, 2026)
f5c34bb [trainer] feat: Add Torchtitan as alternative training engine (#5051) (acisseJZhong, Feb 20, 2026)
37ff251 [training_utils] fix: mask out-of-bounds vocab entries fused kernel L… (EricMarcus-ai, Feb 20, 2026)
c6255ae [rollout] fix: Include routed_experts in ToolAgentLoop return value t… (mirrorboat, Feb 23, 2026)
f56c893 [misc] fix: pass torch dtype when init random model (#5370) (HollowMan6, Feb 23, 2026)
58f38fb [ci] chore: pin version cupy-cuda12x==13.6.0 (#5377) (wuxibin89, Feb 24, 2026)
712de01 [doc] chore: ascend add performance analysis guide and update some ve… (chengminhua, Feb 24, 2026)
cb60e70 [trainer] feat: Support RL trainer with TorchtitanEngine (#5356) (acisseJZhong, Feb 24, 2026)
3671d37 [algo] feat: Exception for agg_loss when `dp_size > 1` but global inf… (tongyx361, Feb 24, 2026)
3309a15 [rollout] fix: make `run_uvicorn` behavior more reliable (#5383) (tongyuantongyu, Feb 24, 2026)
8acd940 [doc] feat: update documentation for The Optimal Token Baseline and R… (jiawei415, Feb 24, 2026)
4c8101d [trainer] refactor: remove fsdp_sft_trainer.py (#5382) (wuxibin89, Feb 24, 2026)
631d797 [ci] fix: occasional CI failures caused by sglang server port conflic… (pengwu22, Feb 24, 2026)
3481d6e [fsdp] fix: add aggressive_empty_cache at end of init_model to preven… (EricMarcus-ai, Feb 24, 2026)
ea042c2 [doc, worker] feat: Enable Megatron-Bridge for MTP (#5323) (HollowMan6, Feb 25, 2026)
6f4942b [ckpt] feat: add kimi ckpt engine backend (#4954) (kip-cxj, Feb 25, 2026)
3eb2a4a [misc] feat: ignore pyrightconfig.json to allow users to customize py… (tongyx361, Feb 25, 2026)
9433f8a [ci] chore: update triton-ascend and fix npu ut (#5396) (yyyy2000, Feb 25, 2026)
9c75bfe [fsdp, megatron] feat: refactor fully-async and one-step-off training… (Shangwei-Li, Feb 26, 2026)
e3b187a [doc] feat: add `fully async` and `one step off` to PR Checklist (#5404) (ArronHZG, Feb 26, 2026)
30b290e [doc] chore: ascend update gspo optimization practice document (#5408) (chengminhua, Feb 26, 2026)
182383b [algo] feat: add DPPO with binary TV or binary KL implementation (#5397) (QPHutu, Feb 26, 2026)
03396b0 [doc] chore: npu best practice doc (#5415) (hustmf, Feb 26, 2026)
b8d91ef [algo] fix: seq mean and default scale factor `loss_mask.shape[-1]` a… (tongyx361, Feb 26, 2026)
6b0bff3 [megatron] fix: missing model offload to CPU for forward_only mode (#… (xhx1022, Feb 26, 2026)
b5979db [megatron] feat: enhance model offloading and loading for frozen para… (RobotGF, Feb 26, 2026)
5f7c345 [perf] fix: the overwritten of Torch_profile with multi steps. (#5395) (Rhetee, Feb 27, 2026)
32705dc [trainer] feat: add padding for tensor alignment in preprocess_thd_no… (RobotGF, Feb 27, 2026)
9dd447e [tool] fix: handle empty image inputs in ToolAgentLoop (#5420) (denismegerle, Feb 27, 2026)
c3e3970 [rollout, data] fix: honor train_max_samples/val_max_samples in fully… (denismegerle, Feb 27, 2026)
48b367b [tool] refactor: remove tool schema plumbing from SingleTurnAgentLoop… (denismegerle, Feb 27, 2026)
c3fc222 [misc] feat: Add code for data grouping in no-padding scenario (#5424) (Kite0011, Feb 27, 2026)
cab2422 [doc] add Dr. MAS to awesome work (#5427) (langfengQ, Feb 27, 2026)
de6a1bc [BREAKING][rollout,cfg] refactor: get rid of actor_rollout_ref config… (wuxibin89, Feb 28, 2026)
a0c3333 [ci] chore: bump the version of vllm-ascend to v0.11.0 in the ascend … (ji-huazhong, Feb 28, 2026)
c179476 [doc] chore: fix npu docs (#5428) (wucong25, Feb 28, 2026)
0ba85c4 [doc] fix: fix npu retool docs (#5449) (LeoYao123, Mar 2, 2026)
d4afed3 [data] refactor: TransferQueue - retire legacy integration codes (#5454) (0oshowero0, Mar 2, 2026)
3015e01 [ci] fix: failed trtllm_unit_tests with attribute error (#5446) (HollowMan6, Mar 2, 2026)
de87452 [megatron] fix: pass dp_group to rearrange_micro_batches to fix DeepE… (xhx1022, Mar 2, 2026)
4ce150f [rollout] fix: remove unexpected concurrency bound at 1000 (#5402) (tongyuantongyu, Mar 2, 2026)
bb351a2 Merge branch 'verl-project:main' into main (SchumiDing, Mar 2, 2026)
.github/CODEOWNERS (1 addition, 0 deletions)
@@ -20,6 +20,7 @@
/verl/workers/actor/megatron_actor.py @ISEEKYAN @vermouth1992
/verl/workers/critic/megatron_critic.py @ISEEKYAN @vermouth1992
/verl/workers/megatron_workers.py @ISEEKYAN @vermouth1992
+ /verl/experimental @wuxibin89 @ArronHZG

/tests/single_controller @zw0610 @wuxibin89
/tests/trainer @eric-haibin-lin @vermouth1992 @tongyx361 @PeterSH6
.github/PULL_REQUEST_TEMPLATE.md (1 addition, 1 deletion)
@@ -6,7 +6,7 @@

- [ ] Search for similar PRs. Paste at least one query link here: ...
- [ ] Format the PR title as `[{modules}] {type}: {description}` (This will be checked by the CI)
- - `{modules}` include `fsdp`, `megatron`, `veomni`, `sglang`, `vllm`, `rollout`, `trainer`, `ci`, `training_utils`, `recipe`, `hardware`, `deployment`, `ray`, `worker`, `single_controller`, `misc`, `perf`, `model`, `algo`, `env`, `tool`, `ckpt`, `doc`, `data`, `cfg`, `reward`
+ - `{modules}` include `fsdp`, `megatron`, `veomni`, `sglang`, `vllm`, `rollout`, `trainer`, `ci`, `training_utils`, `recipe`, `hardware`, `deployment`, `ray`, `worker`, `single_controller`, `misc`, `perf`, `model`, `algo`, `env`, `tool`, `ckpt`, `doc`, `data`, `cfg`, `reward`, `fully_async`, `one_step_off`
- If this PR involves multiple modules, separate them with `,` like `[megatron, fsdp, doc]`
- `{type}` is in `feat`, `fix`, `refactor`, `chore`, `test`
- If this PR breaks any API (CLI arguments, config, function signature, etc.), add `[BREAKING]` to the beginning of the title.
.github/workflows/e2e_ascend.yml (4 additions, 0 deletions)
@@ -126,6 +126,10 @@ jobs:
ray stop --force
export PYTHONPATH=$PYTHONPATH:/Megatron-LM
USE_DIST_CKPT=True USE_DUMMY_MODEL=True DUMMY_MODEL_CONFIG_PATH=tests/special_e2e/ppo_trainer/expert_parallel/qwen3moe_minimal.json DUMMY_MODEL_PATH=$HOME/dist_ckpt/qwen3_30b_grpo_mindspeed bash tests/special_npu/run_qwen3_30b_grpo_mindspeed.sh
+ - name: Running the E2E test with fully_async_policy algorithm (FSDP2)
+   run: |
+     ray stop --force
+     bash tests/special_npu/run_fully_async_policy.sh

vlm_rl_job:
if: github.repository_owner == 'verl-project'
.github/workflows/e2e_one_step_off_policy_ascend.yml (3 additions, 3 deletions)
@@ -68,7 +68,7 @@ on:
# Entrypoints
- ".github/workflows/e2e_one_step_off_policy_ascend.yml"
- "examples/data_preprocess/gsm8k.py"
- "tests/special_e2e/run_one_step_off_policy.sh"
- "tests/special_npu/run_one_step_off_policy.sh"

# Cancel jobs on the same ref if a new one is triggered
concurrency:
@@ -122,7 +122,7 @@ jobs:
- name: Running the E2E test with one_step_off_policy algorithm (FSDP2)
run: |
ray stop --force
- bash tests/special_e2e/run_one_step_off_policy.sh
+ bash tests/special_npu/run_one_step_off_policy.sh

# Test Megatron strategy
e2e_one_step_off_policy_megatron_ascend:
@@ -167,4 +167,4 @@
run: |
ray stop --force
export PYTHONPATH=$PYTHONPATH:/Megatron-LM
- bash tests/special_e2e/run_one_step_off_policy.sh
+ bash tests/special_npu/run_one_step_off_policy.sh
2 changes: 1 addition & 1 deletion .github/workflows/e2e_ppo_trainer_veomni_vllm.yml
Original file line number Diff line number Diff line change
Expand Up @@ -134,7 +134,7 @@ jobs:
- name: Running GEO3K E2E training tests on 8 L20 GPUs with veomni engine (FSDP_SIZE=8, USP=1)
run: |
ray stop --force
- MODEL_ID=Qwen/Qwen3-VL-2B-Instruct TRAIN_FILES=${HOME}/data/geo3k/train.parquet VAL_FILES=${HOME}/data/gsm8k/test.parquet VAL_BEFORE_TRAIN=True NUM_GPUS=8 FSDP_SIZE=8 SP_SIZE=1 EP_SIZE=1 VERL_EXP_NAME="qwen3-2b-vl-function-reward-minimal-fsdp-size8" bash tests/special_e2e/run_ppo_trainer_veomni.sh
+ MODEL_ID=Qwen/Qwen3-VL-2B-Instruct TRAIN_FILES=${HOME}/data/geo3k/train.parquet VAL_FILES=${HOME}/data/gsm8k/test.parquet VAL_BEFORE_TRAIN=True NUM_GPUS=8 FSDP_SIZE=4 SP_SIZE=2 EP_SIZE=1 VERL_EXP_NAME="qwen3-2b-vl-function-reward-minimal-fsdp-size8" bash tests/special_e2e/run_ppo_trainer_veomni.sh

cleanup:
runs-on: ubuntu-latest
.github/workflows/e2e_sft_llm.yml (1 addition, 9 deletions)
@@ -110,7 +110,7 @@ jobs:
- name: Prepare gsm8k dataset
run: |
ray stop --force
- python3 examples/data_preprocess/gsm8k.py --local_dataset_path ${HOME}/models/hf_data/gsm8k
+ python3 examples/data_preprocess/gsm8k_multiturn_sft.py --local_dataset_path ${HOME}/models/hf_data/gsm8k
- name: Running GSM8K E2E training tests on 8 L20 GPUs with rmpad using function rm
run: |
ray stop --force
@@ -123,10 +123,6 @@
run: |
ray stop --force
SP_SIZE=2 bash tests/special_e2e/sft/run_sft.sh
- - name: Check loss difference between sequence parallel vs. default implementation
-   run: |
-     ray stop --force
-     ENTRYPOINT="tests/special_e2e/sft/test_sp_loss_match.py" SP_SIZE=2 bash tests/special_e2e/sft/run_sft.sh
- name: Running GSM8K E2E training tests on 8 L20 GPUs with sequence parallism and liger
run: |
ray stop --force
@@ -140,10 +136,6 @@
ray stop --force
LORA_RANK=32 RESUME_MODE=auto TOTAL_TRAIN_STEP=2 bash tests/special_e2e/sft/run_sft.sh
# TODO: multiturn
- - name: Prepare gsm8k dataset
-   run: |
-     ray stop --force
-     python3 examples/data_preprocess/gsm8k_multiturn_sft.py --local_dataset_path ${HOME}/models/hf_data/gsm8k
- name: Running GSM8K E2E training tests with multiturn and various configs and compare results
run: |
bash tests/special_e2e/sft/test_sft_engine_all.sh
.github/workflows/e2e_sft_llm_ascend.yml (1 addition, 10 deletions)
@@ -109,7 +109,7 @@ jobs:
ln -s /root/.cache/models ~/models
- name: Prepare gsm8k dataset
run: |
- python examples/data_preprocess/gsm8k.py --local_dataset_path ${HOME}/.cache/datasets/openai/gsm8k
+ python3 examples/data_preprocess/gsm8k_multiturn_sft.py --local_dataset_path ${HOME}/.cache/datasets/openai/gsm8k
- name: Running GSM8K E2E training tests on 8 NPUs with rmpad using function rm
run: |
ray stop --force
@@ -122,10 +122,6 @@
run: |
ray stop --force
SP_SIZE=2 bash tests/special_e2e/sft/run_sft.sh
- - name: Check loss difference between sequence parallel vs. default implementation
-   run: |
-     ray stop --force
-     ENTRYPOINT="tests/special_e2e/sft/test_sp_loss_match.py" SP_SIZE=2 bash tests/special_e2e/sft/run_sft.sh
- name: Running GSM8K E2E training tests with LoRA
run: |
ray stop --force
@@ -134,11 +130,6 @@
run: |
ray stop --force
LORA_RANK=32 RESUME_MODE=auto TOTAL_TRAIN_STEP=2 bash tests/special_e2e/sft/run_sft.sh
- # TODO: multiturn
- - name: Prepare gsm8k dataset
-   run: |
-     ray stop --force
-     python3 examples/data_preprocess/gsm8k_multiturn_sft.py --local_dataset_path ${HOME}/.cache/datasets/openai/gsm8k
- name: Running GSM8K E2E training tests with multiturn and various configs and compare results
run: |
export PYTHONPATH=$PYTHONPATH:/Megatron-LM
.github/workflows/e2e_transferqueue.yml (0 additions, 172 deletions)
This file was deleted.
