@izaitsevfb izaitsevfb commented Sep 17, 2025

Explanation from gpt-5-codex:

Info-level lines from the new flow disappear in Lambda because of how Python logging behaves once the runtime has already attached its own handler. On every cold start AWS sets up the root logger with a handler that streams to CloudWatch and leaves the logger at its default WARNING level. Python’s logging.basicConfig(...) (which setup_logging calls in pytorch_auto_revert/main.py:24) only configures the root logger when it has no handlers; if one is already present, the call is a no-op (unless force=True is passed): no level change, no formatter change. Even though LOG_LEVEL is DEBUG, we never actually lower the root logger’s threshold, so logging.info(...) statements from testers/autorevert_v2.py respect the still-WARNING root level and get filtered out. The lone line you’re seeing ([WARNING] Workflow … already restarted …) comes from logging.warning(...) in workflow_checker.py:121, which survives because it meets the logger’s threshold.
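
The no-op behaviour described above can be reproduced locally. The following is a minimal sketch, assuming the Lambda runtime's setup can be approximated by a root logger left at WARNING with one handler already attached; `simulate_preconfigured_root` is a hypothetical helper, not part of the runtime or of this PR.

```python
import logging


def simulate_preconfigured_root() -> logging.Logger:
    """Hypothetical stand-in for a runtime that attaches its own handler.

    Assumption: the root logger has one handler and stays at WARNING.
    """
    root = logging.getLogger()
    for existing in list(root.handlers):
        root.removeHandler(existing)
    root.setLevel(logging.WARNING)
    root.addHandler(logging.StreamHandler())
    return root


root = simulate_preconfigured_root()

# basicConfig does nothing here because the root logger already has a handler.
logging.basicConfig(level=logging.DEBUG)
level_after_basic_config = root.level
info_enabled = root.isEnabledFor(logging.INFO)

# Setting the level directly on the root logger does take effect.
root.setLevel(logging.DEBUG)
info_enabled_after_fix = root.isEnabledFor(logging.INFO)
```

After the `basicConfig` call the root level is still WARNING, so INFO records are filtered before any handler sees them; only the explicit `setLevel` changes that.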

In the legacy flow (see pytorch_auto_revert/testers/autorevert.py), most “logs” were literal print(...) calls. Lambda captures stdout/stderr and labels it as INFO, so those showed up even without any logging configuration. We shifted to structured logging in v2, but because the runtime’s pre-existing handler prevented basicConfig from taking effect, everything below WARNING was silently dropped. The patch we discussed explicitly sets the root logger level (and adjusts handler levels when they’re left at NOTSET), so the value from LOG_LEVEL truly applies and the [v2] … info lines will appear once the Lambda is redeployed.
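
For illustration, the approach described above might look roughly like the sketch below. It is not the actual patch: `configure_root_logging` is a hypothetical name, and the fallback branch (adding a `StreamHandler` when none exists) is an assumption based on the diff fragment quoted in the review further down.

```python
import logging


def configure_root_logging(log_level: str = "INFO") -> None:
    """Apply log_level to the root logger even if handlers already exist."""
    numeric_level = getattr(logging, log_level.upper(), None)
    if not isinstance(numeric_level, int):
        raise ValueError(f"Invalid log level: {log_level}")

    root = logging.getLogger()
    # Set the level explicitly instead of relying on basicConfig,
    # which is skipped when a handler is already attached.
    root.setLevel(numeric_level)

    if root.handlers:
        # Only touch handlers whose level was never set (NOTSET).
        for handler in root.handlers:
            if handler.level == logging.NOTSET:
                handler.setLevel(numeric_level)
    else:
        # Assumed fallback for local runs where nothing is pre-attached.
        handler = logging.StreamHandler()
        handler.setLevel(numeric_level)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s [%(name)s] %(message)s")
        )
        root.addHandler(handler)
```

In a Lambda handler this would typically be called once at import time with the value of the LOG_LEVEL environment variable, e.g. `configure_root_logging(os.environ.get("LOG_LEVEL", "INFO"))`.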


The fix was verified by monkey-patching the logger setup in prod.

handler = logging.StreamHandler()
handler.setLevel(numeric_level)
formatter = logging.Formatter(
"%(asctime)s %(levelname)s [%(name)s] %(message)s"

Don't we want to keep the same log format that we've been using all along?

When running the CI locally, the format does not match. Do we have the same format in the AWS Lambda? Maybe we should consider unifying both?
