Skip to content

chore: flatten monitor classes#2153

Open
mikasenghaas wants to merge 11 commits intomainfrom
chore/flatten-monitors
Open

chore: flatten monitor classes#2153
mikasenghaas wants to merge 11 commits intomainfrom
chore/flatten-monitors

Conversation

@mikasenghaas
Copy link
Copy Markdown
Member

@mikasenghaas mikasenghaas commented Mar 31, 2026

Summary

  • Flatten monitor classes: Remove Monitor ABC, NoOpMonitor, MultiMonitor, and the monitor/ subpackage. WandbMonitor and PrimeMonitor are now standalone classes in utils/, constructed directly by callers.
  • Remove global singleton: Delete setup_monitor() / get_monitor(). Monitors handle disabled state internally at init time — no call-site guards needed.
  • Decouple eval_utils from monitors: evaluate_env() now returns (metrics, outputs) instead of logging internally. The orchestrator handles all monitor logging.
  • Rename config: orchestrator.prime_monitor -> orchestrator.prime (PrimeMonitorConfig -> PrimeConfig). The old [prime_monitor] key still works in TOML configs via pydantic alias and emits a deprecation warning. CLI only supports the new --orchestrator.prime.* flags (tyro does not support aliases for nested config groups, so --orchestrator.prime-monitor.* is rejected).
  • Clean up no-ops: Remove log_eval_samples and log_final_samples stubs from PrimeMonitor.
  • Extract utility: Move sample_items_for_logging -> utils/sampling.py as sample_items.

Note

Medium Risk
Moderate risk due to refactoring and API/config renames in monitoring/logging paths; mistakes could silently drop metrics/samples or break existing configs despite the compatibility alias.

Overview
Refactors monitoring to remove the utils/monitor framework and global setup_monitor()/get_monitor() singleton, replacing it with direct construction of standalone WandbMonitor and PrimeMonitor instances in the orchestrator and trainers.

Decouples evaluation from monitoring by changing evaluate_env() to return (metrics, outputs); the orchestrator now logs eval metrics/samples explicitly to the relevant monitors.

Renames Prime monitoring config from PrimeMonitorConfig/[orchestrator.prime_monitor] to PrimeConfig/[orchestrator.prime], keeping backward compatibility via a pydantic alias and a deprecation warning, and updates docs and shared config auto-fill behavior accordingly. Also extracts rollout sampling logic into utils/sampling.py and updates monitor implementations/tests to use it.

Written by Cursor Bugbot for commit 8cdbf2d. This will update automatically on new commits. Configure here.

mikasenghaas and others added 10 commits March 31, 2026 16:03
Remove the monitor subpackage and its abstractions (Monitor ABC,
NoOpMonitor, MultiMonitor, global singleton). WandbMonitor and
PrimeMonitor are now standalone classes in utils/, constructed
directly by callers. Monitors handle disabled state internally
at init time so call sites need no guards.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
The config field is now `orchestrator.prime` instead of
`orchestrator.prime_monitor`. The old name still works via
validation alias but emits a DeprecationWarning.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
log_final_samples and log_eval_samples were empty pass stubs.
Remove them and their call sites.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
evaluate_env now returns (metrics, outputs) so the orchestrator
handles all monitor logging. Removes monitor dependency from
eval_utils entirely.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
warnings.warn with DeprecationWarning is filtered by default in
Python, so users would never see it. Print to stderr instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@mikasenghaas mikasenghaas marked this pull request as ready for review March 31, 2026 16:53
Copy link
Copy Markdown

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Fix All in Cursor

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

{
"run_id": self.run_id,
"summary": self.history[-1] if self.history else {},
"summary": {},
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PrimeMonitor.save_final_summary always sends empty summary

Medium Severity

PrimeMonitor.save_final_summary now sends "summary": {} to the Prime API. Previously, self.history was populated by log() appending every metrics dict, and the summary was self.history[-1] (the last logged metrics). Since history was removed from PrimeMonitor and log() no longer records metrics locally, the platform finalization call always receives an empty object — losing the final training metrics that were previously sent.

Additional Locations (1)
Fix in Cursor Fix in Web

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think history on prime monitor was always empty, can you confirm @d42me?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Current platform finalize handling does not consume a summary field, so this looks fine to me, cc @JannikSt for visibility

# Conflicts:
#	src/prime_rl/orchestrator/orchestrator.py
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants