feat(rollout-skip): add dump_steps list parameter to specify particular dump steps#5812
feat(rollout-skip): add dump_steps list parameter to specify particular dump steps#5812zyang6 wants to merge 3 commits intoverl-project:mainfrom
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces the dump_steps configuration parameter, allowing users to specify an explicit list of training steps for rollout data dumping and loading, which takes precedence over the max_dump_step window. The changes include updates to documentation, configuration YAML files, and the RolloutSkip utility, along with new unit tests and a typo fix in a warning message. Feedback was provided regarding the robustness of dump_steps parsing, specifically suggesting the use of json.loads to handle string-encoded lists passed via the command line to prevent potential runtime errors.
verl/utils/rollout_skip.py
Outdated
| if raw_steps is not None: | ||
| step_list = [int(x) for x in raw_steps] | ||
| if len(step_list) > 0: | ||
| self._dump_step_set = frozenset(step_list) |
There was a problem hiding this comment.
The current implementation for parsing dump_steps is not robust enough to handle string inputs, which can occur when the parameter is passed via the command line (e.g., dump_steps='[1,2,5]'). The documentation in docs/advance/rollout_skip.rst even provides an example with a string value. The current code [int(x) for x in raw_steps] will iterate over the characters of the string and raise a ValueError, causing a crash.
The code should be updated to handle string-encoded lists, for example by using json.loads.
| if raw_steps is not None: | |
| step_list = [int(x) for x in raw_steps] | |
| if len(step_list) > 0: | |
| self._dump_step_set = frozenset(step_list) | |
| if raw_steps is not None: | |
| if isinstance(raw_steps, str): | |
| try: | |
| raw_steps = json.loads(raw_steps) | |
| except json.JSONDecodeError: | |
| print(f"{self.print_mark}\033[31mWarning: Could not parse 'dump_steps' from string: '{raw_steps}'. It will be ignored.\033[0m") | |
| raw_steps = None | |
| if raw_steps: | |
| try: | |
| step_list = [int(x) for x in raw_steps] | |
| if step_list: | |
| self._dump_step_set = frozenset(step_list) | |
| except (ValueError, TypeError): | |
| print(f"{self.print_mark}\033[31mWarning: 'dump_steps' contains non-integer values: '{raw_steps}'. It will be ignored.\033[0m") |
docs/advance/rollout_skip.rst
Outdated
| actor_rollout_ref.rollout.skip.max_dump_step=10 | ||
|
|
||
| # Optional: dump only on selected training steps (1-based), e.g. steps 1, 2, and 5: | ||
| # actor_rollout_ref.rollout.skip.dump_steps='[1,2,5]' |
There was a problem hiding this comment.
This line is code; no annotation symbols(#) are needed. Simply state in the comments that it's optional.
What does this PR do?
follow up #5556
API and Usage Example
Design & Code Changes
Checklist Before Submitting
Important
Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.
pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=alwaysci-requestchannel in theverlSlack workspace. (If not accessible, please try the Feishu group (飞书群).)recipesubmodule, please also update the reference to the submodule commit viagit submodule update --remoteorcd recipe && git pull origin main.