@izaitsevfb (Contributor) commented Sep 18, 2025

features:

  1. Added the ability to render an HTML, HUD-like signal grid from the timestamp of an autorevert run:
       python -m pytorch_auto_revert hud "2025-09-17 17:44:14"
       # or
       python -m pytorch_auto_revert hud "2025-09-17 17:44:14" --hud-html hud01.html
  2. Added an optional flag to pytorch_auto_revert to render the run results in the same format:
       python -m pytorch_auto_revert --dry-run autorevert-checker Lint trunk pull inductor rocm rocm-mi300 --hours 18 --hud-html hud.html

other changes:

  1. The old state did not log outcomes cleanly; outcomes are now recorded in the state.
  2. Introduced state versioning.
  3. The legacy state format is still supported for backward compatibility (although it is missing some non-critical info).
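
One way versioning with a legacy fallback could look, as a minimal sketch. The `version` key, the `CURRENT_VERSION` constant, and the `outcomes` field are hypothetical names chosen for illustration; the PR's actual state layout is not shown in this conversation.

```python
import json
from typing import Dict

CURRENT_VERSION = 2  # hypothetical version number


def load_state(raw: str) -> Dict[str, object]:
    """Parse a run-state JSON string and normalize it to the current layout."""
    state = json.loads(raw)
    if not isinstance(state, dict):
        raise ValueError("run state must be a JSON object")

    # Assumption: legacy payloads carry no version marker, so treat them as v1.
    version = state.get("version", 1)
    if not isinstance(version, int) or version > CURRENT_VERSION:
        raise ValueError(f"unsupported state version: {version!r}")

    if version < CURRENT_VERSION:
        # Legacy states lack outcome info, so fill in an empty placeholder.
        state = {**state, "version": CURRENT_VERSION}
        state.setdefault("outcomes", {})
    return state
```

With this shape, readers such as the HUD renderer only ever see the current layout, and the missing legacy fields show up as empty values rather than as key errors.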

Warning

Disclaimer: the code (in particular the renderer) is mostly vibecoded and messy; however, it's not critical and functions well enough for now.


testing

Ran commands locally, including combo of autorevert-checker in dry-run mode + hud on its timestamp.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Sep 18, 2025
@@ -113,6 +113,15 @@ def get_opts() -> argparse.Namespace:
"Revert mode: skip, log (no side effects), run-notify (side effect), or run-revert (side effect)."
),
)
workflow_parser.add_argument(
Contributor

This functionality should not be a flag on revert mode; it should be a separate entry point that reads from the database and creates this HTML.

This is because I don't want to couple the autorevert logic to the HTML generation logic.

Coupling them is not safe for a service; it makes more sense for those to be separate entities/actions.

Contributor Author

wdym?

There IS a separate entry point: the hud subcommand, which reads from the database and prints the result.

The flag on autorevert just reuses that functionality to optionally dump the results of the run as HTML; it's a convenience.
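
The two entry points under discussion can be pictured with a small argparse sketch. It is reconstructed only from the command lines in the PR description; the `build_parser` helper, the `dest` name, and all defaults are assumptions, not the PR's actual parser.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="pytorch_auto_revert")
    parser.add_argument("--dry-run", action="store_true")
    sub = parser.add_subparsers(dest="subcommand", required=True)

    # Standalone entry point: look up a stored run by timestamp and render it.
    hud = sub.add_parser("hud")
    hud.add_argument("timestamp")
    hud.add_argument("--hud-html", default=None)  # default is an assumption

    # Checker entry point: the same flag is an optional convenience here.
    checker = sub.add_parser("autorevert-checker")
    checker.add_argument("workflows", nargs="+")
    checker.add_argument("--hours", type=int)
    checker.add_argument("--hud-html", default=None)  # default is an assumption
    return parser
```

In this layout the checker only touches the renderer when `--hud-html` is passed, which is the coupling the reviewer is questioning and the convenience the author is defending.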

@@ -17,15 +18,16 @@ def autorevert_v2(
repo_full_name: str = "pytorch/pytorch",
restart_action: RestartAction = RestartAction.RUN,
revert_action: RevertAction = RevertAction.LOG,
) -> Tuple[List[Signal], List[Tuple[Signal, SignalProcOutcome]]]:
out_hud: Optional[str] = None,
Contributor

If this entry point is going to be used in production, I believe we should not add the HTML logic to it. It is unnecessary complexity that adds risk in production.


from ..hud_renderer import build_grid_model, render_html
from ..signal_extraction import SignalExtractor
from ..hud_renderer import render_html_from_state
@jeanschmidt (Contributor) commented Sep 18, 2025

hud_renderer generates HTML by string concatenation.

I believe we should move away from this; it is NOT maintainable or easy to understand. Please replace it with a proper template engine. It is OK to have the template as a string in a file.
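
As a rough illustration of the template approach, here is a minimal sketch using only the standard library's `string.Template`. The `PAGE` and `ROW` templates and the `render_grid` helper are hypothetical and much simpler than a real HUD grid; a fuller engine such as Jinja2 would add loops and autoescaping inside the template itself.

```python
from html import escape
from string import Template
from typing import Sequence, Tuple

# Templates are plain strings here; they could equally be loaded from files.
PAGE = Template(
    "<html><body><h1>$title</h1><table>\n$rows\n</table></body></html>"
)
ROW = Template('<tr><td>$signal</td><td class="$status">$status</td></tr>')


def render_grid(title: str, cells: Sequence[Tuple[str, str]]) -> str:
    """Render (signal name, status) pairs into an HTML table."""
    rows = "\n".join(
        # Escape every dynamic value before it is substituted into markup.
        ROW.substitute(signal=escape(name), status=escape(status))
        for name, status in cells
    )
    return PAGE.substitute(title=escape(title), rows=rows)
```

The markup stays readable in one place, and the Python side is reduced to preparing values and escaping them.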


Returns the output filepath.
"""
def _ensure_state_dict(state: RunStatePayload) -> Mapping[str, Any]:
@jeanschmidt (Contributor) commented Sep 18, 2025

Please add strict jsonschema validation here.

Otherwise, errors will be random, cryptic, and hard to understand if someone accidentally provides the wrong JSON file.

My recommendation:

Use the same type for the type hints, and run jsonschema validation on load AND on write to the database, so that bugs cannot poison the database.
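
A sketch of what validate-on-load could look like. The `jsonschema` package would be the natural tool for this; to stay dependency-free, the example below hand-rolls a checker for a small subset of JSON Schema keywords instead. The schema contents are assumptions: only `meta` and `workflows` appear in the diff under review.

```python
import json

# Hypothetical schema fragment for the run state.
STATE_SCHEMA = {
    "type": "object",
    "required": ["meta"],
    "additionalProperties": False,
    "properties": {
        "meta": {
            "type": "object",
            "required": ["workflows"],
            "properties": {
                "workflows": {"type": "array", "items": {"type": "string"}},
            },
        },
    },
}

_TYPES = {"object": dict, "array": list, "string": str}


def check(value: object, schema: dict, path: str = "$") -> None:
    """Raise ValueError naming the first path that violates the schema."""
    if not isinstance(value, _TYPES[schema["type"]]):
        raise ValueError(
            f"{path}: expected {schema['type']}, got {type(value).__name__}"
        )
    if isinstance(value, dict):
        props = schema.get("properties", {})
        for key in schema.get("required", []):
            if key not in value:
                raise ValueError(f"{path}: missing required key {key!r}")
        for key, item in value.items():
            if key in props:
                check(item, props[key], f"{path}.{key}")
            elif schema.get("additionalProperties", True) is False:
                raise ValueError(f"{path}: unexpected key {key!r}")
    elif isinstance(value, list) and "items" in schema:
        for i, item in enumerate(value):
            check(item, schema["items"], f"{path}[{i}]")


def load_validated_state(raw: str) -> dict:
    """Parse the JSON and fail early, with a path, if it has the wrong shape."""
    state = json.loads(raw)
    check(state, STATE_SCHEMA)
    return state
```

Calling the same `check` just before writing to the database would cover the "validate on write" half of the recommendation.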


Returns the output filepath.
"""
def _ensure_state_dict(state: RunStatePayload) -> Mapping[str, Any]:
Contributor

Return a proper type that specifies the correct, full JSON structure. It is probably too complex to put inline here, so please build type classes and type definitions.

If you used Any as a type, you're doing it wrong.
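
One way to spell out the structure without `Any` is `typing.TypedDict`. This is only a sketch: `meta` and `workflows` come from the diff, while `repo`, `SignalOutcome`, and `outcomes` are hypothetical placeholders for whatever the real state contains.

```python
from typing import List, TypedDict, Union


class RunMeta(TypedDict):
    workflows: List[str]
    repo: str  # hypothetical field


class SignalOutcome(TypedDict):  # hypothetical record shape
    signal: str
    outcome: str


class RunState(TypedDict):
    meta: RunMeta
    outcomes: List[SignalOutcome]  # hypothetical field


# The payload alias from the diff, with the mapping side fully typed.
RunStatePayload = Union[str, RunState]


def workflows_of(state: RunState) -> List[str]:
    """Typed access: a checker can now flag misspelled or missing keys."""
    return state["meta"]["workflows"]
```

A `TypedDict` is an ordinary `dict` at runtime, so it adds static checking only; runtime validation still has to happen separately.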

Extracts signals for the given workflows, optionally truncates commit history
to ignore commits newer than a specific SHA, builds a HUD model, and writes
an HTML report to `out`.
RunStatePayload = Union[str, Mapping[str, Any]]
Contributor

Same for the data type here: avoid Any, be strict, and build the full JSON data type.

"""Render the given run-state JSON (string or mapping) to HUD HTML."""
state_dict = _ensure_state_dict(state)
meta = state_dict.get("meta", {})
workflows = meta.get("workflows") or []
Contributor

workflows = meta.get("workflows", [])
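
The suggested spelling and the original are not quite interchangeable. They agree when the key is absent, but differ when the key is present with a null (or otherwise falsy) value, which matters if stored JSON can contain `"workflows": null`. A short sketch, with hypothetical helper names:

```python
from typing import Mapping, Optional


def workflows_or(meta: Mapping[str, Optional[list]]) -> Optional[list]:
    # Original form: falls back to [] for a missing key AND for None.
    return meta.get("workflows") or []


def workflows_default(meta: Mapping[str, Optional[list]]) -> Optional[list]:
    # Suggested form: falls back to [] only when the key is missing.
    return meta.get("workflows", [])


missing: dict = {}
null_value = {"workflows": None}

print(workflows_or(missing), workflows_default(missing))
print(workflows_or(null_value), workflows_default(null_value))
```

If the state is schema-validated so that `workflows` is always a list, the two forms behave the same and the shorter one is fine.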

row = rows[0]
repo = row["repo"]
workflows = row["workflows"]
state_json = row["state"]
Contributor

Validate the JSON here.

Labels: ci-no-td, CLA Signed