feat(rollout-skip): add dump_steps list parameter to specify particular dump steps#5812

Open
zyang6 wants to merge 3 commits into verl-project:main from zyang6:latest_skip_rollout_v3
@zyang6 zyang6 commented Mar 30, 2026

What does this PR do?

  • Add a new parameter: rollout.skip.dump_steps (a list of integers; 1-based step indices).
  • Keep the existing parameter: rollout.skip.max_dump_step (dumps/loads for steps in the range [1, max_dump_step]).

Follow-up to #5556.

API and Usage Example

Demonstrate how the API changes if any, and provide usage example(s) if possible.

# Enable the feature, set the path for saving data, and dump/load only at steps 1, 2, and 5.
actor_rollout_ref.rollout.skip.enable=true \
actor_rollout_ref.rollout.skip.dump_dir=/path/skip_rollout/ \
actor_rollout_ref.rollout.skip.dump_steps='[1,2,5]'

Design & Code Changes

  • If dump_steps is set and non-empty:
      • Enable dump/load only when curr_train_step is in dump_steps.
      • dump_steps overrides the max_dump_step window.

  • If dump_steps is null or an empty list ([]):
      • Fall back to the existing window logic: dump/load while curr_train_step <= max_dump_step.

  • Compatibility:
      • Existing configs that rely on max_dump_step continue to work unchanged.
      • dump_steps is optional; leaving it null (the default) preserves current behavior.
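
The precedence rules above can be sketched as a small standalone function. This is only an illustration, not the code in this PR: the function name `should_dump` and its signature are assumptions, and only the names `dump_steps`, `max_dump_step`, and `curr_train_step` come from the description.

```python
from typing import Iterable, Optional


def should_dump(curr_train_step: int,
                max_dump_step: int,
                dump_steps: Optional[Iterable[int]] = None) -> bool:
    """Hypothetical helper mirroring the rules described in this PR."""
    step_set = frozenset(int(s) for s in dump_steps) if dump_steps is not None else frozenset()
    if step_set:
        # A non-empty dump_steps overrides the max_dump_step window.
        return curr_train_step in step_set
    # null or [] falls back to the existing window logic.
    return curr_train_step <= max_dump_step


# With dump_steps set, only the listed steps are selected.
print([s for s in range(1, 8) if should_dump(s, max_dump_step=3, dump_steps=[1, 2, 5])])
# Without dump_steps, steps 1..max_dump_step are selected.
print([s for s in range(1, 8) if should_dump(s, max_dump_step=3)])
```

Under these assumptions, the first call selects steps 1, 2, and 5 even though step 5 lies outside the max_dump_step window, and the second selects steps 1 through 3.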

Checklist Before Submitting

Important

Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

@zyang6 zyang6 marked this pull request as ready for review March 30, 2026 09:05
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces the dump_steps configuration parameter, allowing users to specify an explicit list of training steps for rollout data dumping and loading, which takes precedence over the max_dump_step window. The changes include updates to documentation, configuration YAML files, and the RolloutSkip utility, along with new unit tests and a typo fix in a warning message. Feedback was provided regarding the robustness of dump_steps parsing, specifically suggesting the use of json.loads to handle string-encoded lists passed via the command line to prevent potential runtime errors.

Comment on lines +128 to +131
    if raw_steps is not None:
        step_list = [int(x) for x in raw_steps]
        if len(step_list) > 0:
            self._dump_step_set = frozenset(step_list)
high

The current implementation for parsing dump_steps is not robust enough to handle string inputs, which can occur when the parameter is passed via the command line (e.g., dump_steps='[1,2,5]'). The documentation in docs/advance/rollout_skip.rst even provides an example with a string value. The current code [int(x) for x in raw_steps] will iterate over the characters of the string and raise a ValueError, causing a crash.

The code should be updated to handle string-encoded lists, for example by using json.loads.
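
The failure mode described in this comment can be reproduced in isolation. The snippet below is a minimal sketch that assumes the value really does arrive as the Python string "[1,2,5]"; whether it does in practice depends on how the configuration system parses the override, which this snippet does not model.

```python
import json

raw_steps = "[1,2,5]"  # assumption: the value arrives as a string, not a list

try:
    # Iterating a string yields single characters, so int("[") is attempted first.
    step_list = [int(x) for x in raw_steps]
    error_message = None
except ValueError as exc:
    error_message = str(exc)
    print(f"ValueError: {exc}")

# Decoding the string first yields a real list of integers.
parsed = json.loads(raw_steps)
print(parsed)
```

The first character of the string is "[", which is not a valid integer literal, so the comprehension fails before any step is read; decoding with json.loads first avoids this.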

Suggested change
-   if raw_steps is not None:
-       step_list = [int(x) for x in raw_steps]
-       if len(step_list) > 0:
-           self._dump_step_set = frozenset(step_list)
+   if raw_steps is not None:
+       if isinstance(raw_steps, str):
+           try:
+               raw_steps = json.loads(raw_steps)
+           except json.JSONDecodeError:
+               print(f"{self.print_mark}\033[31mWarning: Could not parse 'dump_steps' from string: '{raw_steps}'. It will be ignored.\033[0m")
+               raw_steps = None
+       if raw_steps:
+           try:
+               step_list = [int(x) for x in raw_steps]
+               if step_list:
+                   self._dump_step_set = frozenset(step_list)
+           except (ValueError, TypeError):
+               print(f"{self.print_mark}\033[31mWarning: 'dump_steps' contains non-integer values: '{raw_steps}'. It will be ignored.\033[0m")
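
For experimenting with the parsing behavior outside the class, the same idea can be written as a standalone function. This is a hedged sketch rather than part of the PR: the name `normalize_dump_steps` is hypothetical, and it returns None instead of printing a warning so that it stays self-contained.

```python
import json
from typing import Any, FrozenSet, Optional


def normalize_dump_steps(raw_steps: Any) -> Optional[FrozenSet[int]]:
    """Hypothetical helper: return a frozenset of steps, or None if unset or invalid."""
    if isinstance(raw_steps, str):
        try:
            raw_steps = json.loads(raw_steps)
        except json.JSONDecodeError:
            return None  # unparseable string: ignore the setting
    if not raw_steps:
        return None  # None or empty list: caller falls back to max_dump_step
    try:
        return frozenset(int(x) for x in raw_steps)
    except (ValueError, TypeError):
        return None  # non-integer entries or a non-iterable value


print(normalize_dump_steps([1, 2, 5]) == frozenset({1, 2, 5}))    # list input
print(normalize_dump_steps("[1,2,5]") == frozenset({1, 2, 5}))    # string-encoded list
print(normalize_dump_steps("not a list"))                         # invalid string
print(normalize_dump_steps([]))                                   # empty list
```

Returning None for every "ignore" case lets the caller treat unset, empty, and invalid values the same way, which matches the fallback to max_dump_step described in the PR.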

actor_rollout_ref.rollout.skip.max_dump_step=10

# Optional: dump only on selected training steps (1-based), e.g. steps 1, 2, and 5:
# actor_rollout_ref.rollout.skip.dump_steps='[1,2,5]'
This line is code, so it does not need the comment marker (#). Just state in the comment above it that the setting is optional.

Contributor Author

ok
