Speech RL recipe update #24

Merged
sharonyu-115 merged 9 commits into nvidia-china-sae:main from yuekaizhang:speech_rl on Aug 11, 2025
@yuekaizhang yuekaizhang commented Aug 6, 2025

Collaborator
  • Updated the Docker image: docker pull soar97/verl:app-verl0.4-vllm0.8.5-mcore0.12.2-te2.2
  • Changed the training dataset from aishell3 to emilia_zh
  • Fixed the WER computation scripts and added support for computing MER in code-switching cases
  • Fixed the input text normalization for cosyvoice3 zero_shot_zh
  • Reran the experiments with temperature=1.0 and top_p=1, and disabled algorithm.norm_adv_by_std_in_grpo to avoid difficulty bias
  • Updated the reward function and metrics (errors caused by tone are now ignored)
  • Added a DAPO training recipe (should work with verl-project/verl@75de3de)
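
As a rough illustration of why disabling algorithm.norm_adv_by_std_in_grpo can reduce difficulty bias, the sketch below computes group-relative advantages with and without dividing by the group's standard deviation. It is a minimal sketch, not verl's implementation: the function name grpo_advantages, the eps value, the use of the population standard deviation, and the example rewards are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def grpo_advantages(rewards, norm_by_std=True, eps=1e-6):
    """Group-relative advantages for the sampled responses of one prompt.

    Hypothetical helper for illustration only. With norm_by_std=False the
    advantage is just the reward minus the group mean.
    """
    mu = mean(rewards)
    centered = [r - mu for r in rewards]
    if not norm_by_std:
        return centered
    # Assumption: population std; a real trainer may use the sample std.
    sigma = pstdev(rewards)
    return [c / (sigma + eps) for c in centered]


# An "easy" prompt: almost every sample scores well, so the spread is tiny.
easy = [0.98, 0.98, 0.98, 0.94]
# A "medium" prompt: rewards are spread widely across samples.
medium = [0.9, 0.5, 0.3, 0.1]

for name, group in (("easy", easy), ("medium", medium)):
    with_std = [round(a, 3) for a in grpo_advantages(group, norm_by_std=True)]
    without_std = [round(a, 3) for a in grpo_advantages(group, norm_by_std=False)]
    print(name, "std-normalized:", with_std, "mean-centered only:", without_std)
```

With std normalization, the easy group's small reward differences are rescaled to the same order of magnitude as the medium group's, so low-variance prompts can weigh as heavily in the update as informative ones. With mean-centering only, the advantage magnitude stays proportional to the actual reward gap.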
| Model | Seed-TTS test_zh CER | Cosyvoice3 zero_shot_zh | Comment |
| --- | --- | --- | --- |
| SFT (initialized from Qwen2-0.5B-Instruct) | 1.70% | 4.26% | See PR #1887 |
| GRPO (this work, trained on AIShell-3) | 1.06% | 3.01% | Commit |
| GRPO (this work, trained on emilia_zh subset, using top_p=1, temperature=1.0) | 0.87% | 2.63% | (1800 steps) |
| DAPO (this work, trained on emilia_zh subset, using top_p=1, temperature=1.0) | 0.83% | 2.71% | (700 steps) See run_dapo.sh |
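
For readers unfamiliar with the metrics in the table, here is a minimal sketch of how a character-level error rate can be extended to code-switched text. It assumes MER denotes a mixed error rate in which Chinese characters and English words each count as one token. It is not the script from this PR: the helper names tokenize_mixed, edit_distance, and mixed_error_rate are hypothetical, and the tokenization regex is a simplification that drops punctuation and does no further text normalization.

```python
import re


def tokenize_mixed(text):
    """Split text into CJK characters and lowercased Latin-script words.

    Assumption: punctuation and whitespace are discarded, and only the
    basic CJK Unified Ideographs range is treated as Chinese.
    """
    return re.findall(r"[\u4e00-\u9fff]|[a-z0-9']+", text.lower())


def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists (row-by-row DP)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def mixed_error_rate(ref_text, hyp_text):
    """Edit distance over mixed tokens divided by the reference length."""
    ref = tokenize_mixed(ref_text)
    hyp = tokenize_mixed(hyp_text)
    if not ref:
        raise ValueError("reference contains no tokens")
    return edit_distance(ref, hyp) / len(ref)


# Two substituted characters out of six reference tokens.
print(mixed_error_rate("我喜欢 Python 编程", "我喜欢 Python 变成"))
```

On purely Chinese input this reduces to a plain character error rate, and on purely English input to a word error rate, which is why a single mixed metric is convenient for code-switched test sets.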

@sharonyu-115 sharonyu-115 left a comment

Collaborator


LGTM. Thank you!

@sharonyu-115 sharonyu-115 merged commit 3f4582c into nvidia-china-sae:main Aug 11, 2025