Generalize MFU/FLOPs module across recipes with log_mfu training hook #1548
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -65,3 +65,4 @@ quant_stats_config:` |
| 65 | 65 | `fp8_layers: null` |
| 66 | 66 | `fp4_layers: null` |
| 67 | 67 | `use_fp32_master_weights: null` |
| | 68 | `+ log_mfu: false` |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -156,7 +156,11 @@ def main(args: DictConfig) -> float \| None:` |
| 156 | 156 | `start_step = 0` |
| 157 | 157 | `epoch = 0` |
| 158 | 158 | |
| 159 | | `- perf_logger = PerfLogger(dist_config, args)` |
| | 159 | `+ perf_logger = PerfLogger(` |
| | 160 | `+ dist_config,` |
| | 161 | `+ args,` |
| | 162 | `+ model_config_dict=config.to_dict() if args.get("log_mfu", False) else None,` |
| | 163 | `+ )` |
| 160 | 164 | |
| 161 | 165 | `# Training loop` |
| 162 | 166 | `step = start_step` |

Review thread on the `model_config_dict=...` line:

Collaborator
Throughout, this could fit on one line if you did this check inside the perf logger. In general, we want to keep these training scripts as clean as possible: `perf_logger = PerfLogger(dist_config, args, model_config=config)`, and inside perf_logger.py: `if args.log_mfu: ...`
Do we need `args.get()`? We should just make this default to false in the Hydra default.yaml.

Contributor, Author
Done in b9f31ae. Train scripts now just call …
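The reviewer's suggestion in this thread, moving the `log_mfu` check out of the training script and into the perf logger, can be sketched roughly as follows. This is a minimal illustration and not the PR's actual `perf_logger.py`: `PerfLoggerSketch` and `DummyConfig` are hypothetical stand-ins, and `SimpleNamespace` substitutes for the Hydra `DictConfig`. It assumes `log_mfu` is always defined by the config defaults, so plain attribute access replaces `args.get("log_mfu", False)`.

```python
from types import SimpleNamespace
from typing import Any, Optional


class DummyConfig:
    """Hypothetical model config exposing to_dict(), like `config` in the diff."""

    def __init__(self, **values: Any) -> None:
        self._values = dict(values)

    def to_dict(self) -> dict:
        return dict(self._values)


class PerfLoggerSketch:
    """Hypothetical stand-in for PerfLogger; not the real class from the PR."""

    def __init__(self, dist_config: Any, args: Any, model_config: Optional[Any] = None) -> None:
        self.dist_config = dist_config
        self.args = args
        self.model_config_dict: Optional[dict] = None
        # The flag is checked here, so the training script can always pass the
        # model config. Attribute access assumes the defaults define log_mfu.
        if args.log_mfu:
            if model_config is None:
                raise ValueError("log_mfu=True requires a model_config")
            self.model_config_dict = model_config.to_dict()


# The training-script call now fits on one line regardless of the flag.
config = DummyConfig(hidden_size=1024, num_hidden_layers=24)
logger_on = PerfLoggerSketch(None, SimpleNamespace(log_mfu=True), model_config=config)
logger_off = PerfLoggerSketch(None, SimpleNamespace(log_mfu=False), model_config=config)
```

With this shape, the model config is only converted to a dict when MFU logging is enabled, and the call site stays identical across recipes.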