
Fix PPORecurrent training issue: tuned learning rate and added missing max_grad_norm #4

Open

emiliof114 wants to merge 1 commit into fei-yang-wu:main from emiliof114:fix/lr-tuning

Conversation

@emiliof114

This PR fixes the failing PPORecurrent training test by stabilizing optimization.

Changes:

  • Added max_grad_norm parameter to optimizer config.

  • Tuned learning rate from 3e-4 to 1e-4 for better convergence (a sketch of the change follows this list).

  • Verified that all tests pass locally (6/6).
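
For reference, a minimal sketch of the kind of hyperparameter change described above. This is an illustration under assumptions: the config keys, the 0.5 clipping value, and the constructor signature are hypothetical and may not match the repo's actual API.

```python
# Illustrative sketch only -- key names and the 0.5 clipping value are
# assumptions, not necessarily the repo's actual config schema.
ppo_recurrent_hparams = {
    # Lowered from 3e-4: a smaller step size helps the recurrent policy's
    # optimization converge instead of oscillating.
    "learning_rate": 1e-4,
    # Previously missing: clips the global gradient norm each update so a
    # single large PPO update cannot destabilize the recurrent weights.
    "max_grad_norm": 0.5,
}

# Hypothetical usage, assuming a PPORecurrent-style constructor that accepts
# these keyword arguments:
# model = PPORecurrent(policy="MlpLstmPolicy", env=env, **ppo_recurrent_hparams)
```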

Result:
PPORecurrent training now improves average return as expected.

