Fix v17 group b warm restart archiving #4502

Merged
JessicaMeixner-NOAA merged 3 commits into NOAA-EMC:dev/gfs.v17 from DavidHuber-NOAA:fix/arch_warm_v17
Feb 4, 2026

Conversation

@DavidHuber-NOAA
Contributor

Description

This fixes warm restart archiving for the v17 branch. For group b, the number of days between SDATE and the archive cycle is now correct.
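
As a rough illustration of why the group b offset matters, here is a sketch only. The names and the start date below are hypothetical, the 6-hour `assim_freq` value is assumed, and the shift follows the companion commit message further down, which describes adding `assim_freq` to the current cycle before counting days since `SDATE`:

```python
from datetime import datetime, timedelta

# Hypothetical values for illustration; not taken from any experiment config.
SDATE = datetime(2021, 12, 21, 0)
ARCH_WARMICFREQ = 2               # archive warm ICs every 2 days
ASSIM_FREQ = timedelta(hours=6)   # assumed cycling interval

group_a_cycle = datetime(2021, 12, 23, 0)    # the archive cycle
group_b_cycle = group_a_cycle - ASSIM_FREQ   # the previous cycle (18Z)


def on_archive_day(reference_cycle):
    """True if the whole-day count since SDATE is a multiple of ARCH_WARMICFREQ."""
    # timedelta.days drops the partial day, so 1 day 18 h counts as 1.
    days_since_sdate = (reference_cycle - SDATE).days
    return days_since_sdate % ARCH_WARMICFREQ == 0


group_a_ok = on_archive_day(group_a_cycle)                      # 2 days
group_b_unshifted = on_archive_day(group_b_cycle)               # 1 day 18 h
group_b_shifted = on_archive_day(group_b_cycle + ASSIM_FREQ)    # same count as group a
```

With these assumed values the shifted group b check agrees with group a, while the un-shifted one does not. The un-shifted call is shown only for contrast; it is not claimed to be the pre-fix code.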

Resolves #4501

Type of change

  • Bug fix (fixes something broken)
  • New feature (adds functionality)
  • Maintenance (code refactor, clean-up, new CI test, etc.)

Change characteristics

  • Is this change expected to change outputs? YES
    • GFS (restart archiving)
    • GEFS
    • SFS
    • GCAFS
  • Is this a breaking change (a change in existing functionality)? NO
  • Does this change require a documentation update? NO
  • Does this change require an update to any of the following submodules? NO

How has this been tested?

To be tested in a C96C48 case and in retros

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • This change is covered by an existing CI test or a new one has been added

@JessicaMeixner-NOAA
Contributor

@DavidHuber-NOAA - one quick question: so the enkfgdas restartb's were fine but the gdas restartb's were not. I don't quite understand how this was different for each of these. It makes sense to not account for a particular SDATE, but what would be a good way to test this otherwise to make sure things are working as expected, versus getting 7 days out and noticing things aren't as expected?

@DavidHuber-NOAA
Contributor Author

@JessicaMeixner-NOAA I will test this (after applying an additional fix to the gdas archiving) with ARCH_WARMICFREQ=2, SDATE=2021122112, and ARCH_CYC=00 on a C96C48 case.

@DavidHuber-NOAA
Contributor Author

I've launched a C96C48mx500_S2SW_cyc_gfs test case on Ursa with the following modifications:

SDATE=2021122012
EDATE=2021122406
ARCH_CYC=00
ARCH_WARMICFREQ=2

This should create full warm restarts on cycles 2021122218 (group b) and 2021122300 (group a). I will post again when the test completes.

@DavidHuber-NOAA DavidHuber-NOAA marked this pull request as draft February 2, 2026 17:08
@DavidHuber-NOAA DavidHuber-NOAA marked this pull request as ready for review February 3, 2026 13:21
@DavidHuber-NOAA
Contributor Author

After running the test, which I ended up extending to 2021122500, I was able to verify that the 2021122218 and 2021122300 restarts were correct. One somewhat surprising result was that a full suite of restarts was also generated for 2021122018 and 2021122100. I can see why this happens, namely

    days_since_sdate = (arch_dict.current_cycle - SDATE).days
    if arch_dict.ARCH_FCSTICFREQ > 0 and days_since_sdate % arch_dict.ARCH_FCSTICFREQ == 0:
        # We are on the right cycle hour and the right day
        return True

and

    days_since_sdate = (ics_offset_cycle - SDATE).days
    if arch_dict.ARCH_WARMICFREQ > 0 and days_since_sdate % arch_dict.ARCH_WARMICFREQ == 0:
        # We are on the right cycle hour and the right day
        return True

Since we are only checking that mod(days_since_sdate, 2) == 0, the check also passes when days_since_sdate = 0. I think I am OK with this, as it will make it easy to test in the future that restarts are correctly archived, though it may appear inconsistent to users. Experiments starting on 06Z or 12Z will have restarts written on the first 18Z and 00Z cycles, while experiments starting on 00Z or 18Z will not have sets written until ARCH_WARMICFREQ days later.
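
To make the day-count behaviour concrete, here is a minimal sketch that replays the test settings (SDATE=2021122012, ARCH_CYC=00, ARCH_WARMICFREQ=2). `is_warm_ic_cycle` is a hypothetical stand-in, not the workflow's function; the 6-hour cycling interval and the explicit cycle-hour check are assumptions, and the group b shift follows the companion commit message's description of adding `assim_freq` to the current cycle.

```python
from datetime import datetime, timedelta

# Settings from the Ursa test described above.
SDATE = datetime(2021, 12, 20, 12)
ARCH_CYC = 0
ARCH_WARMICFREQ = 2
ASSIM_FREQ = timedelta(hours=6)  # assumed cycling interval


def is_warm_ic_cycle(cycle, group):
    """Hypothetical stand-in for the archive check, not the workflow's function."""
    # Assumption: group b (the previous cycle) is shifted forward by one cycle
    # so that both groups are compared against the same archive cycle.
    ref = cycle + ASSIM_FREQ if group == "b" else cycle
    if ref.hour != ARCH_CYC:
        return False
    # timedelta.days drops the partial day: 12 h after SDATE counts as day 0.
    days_since_sdate = (ref - SDATE).days
    return ARCH_WARMICFREQ > 0 and days_since_sdate % ARCH_WARMICFREQ == 0


# Every 6-hourly cycle from 2021122012 through 2021122500.
cycles = [SDATE + n * ASSIM_FREQ for n in range(19)]

group_b = [c.strftime("%Y%m%d%H") for c in cycles if is_warm_ic_cycle(c, "b")]
group_a = [c.strftime("%Y%m%d%H") for c in cycles if is_warm_ic_cycle(c, "a")]

print("group b:", group_b)
print("group a:", group_a)
```

Under these assumptions both lists include the first-day cycles (2021122018 and 2021122100) because their day count is 0, alongside the 2021122218 and 2021122300 cycles the test was designed to produce.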

@JessicaMeixner-NOAA @CatherineThomas-NOAA @RuiyuSun @LydiaStefanova-NOAA Here is the listing of the restarts written for the 2021122218 and 2021122300 cycles. Can you please verify these are correct:

/NCEPDEV/emc-global/1year/David.Huber/URSA/scratch/C96C48mx500_S2SW_cyc_gfs_restart_warm_dev/2021122218:
enkfgdas_restartb_grp1.tar
gdasocean_analysis.tar
gdasocean_restart.tar
gdaswave_restart.tar
gdas_restartb.tar

/NCEPDEV/emc-global/1year/David.Huber/URSA/scratch/C96C48mx500_S2SW_cyc_gfs_restart_warm_dev/2021122300:
gdasice_restart.tar
enkfgdas_restarta_grp1.tar
gdasocean_analysis.tar
gfs_restarta.tar
gdas_restarta.tar

@DavidHuber-NOAA
Contributor Author

Opening for review, though I will complete a full suite of tests on Cactus (when it comes back up) and on Gaea C6.

@emcbot emcbot added the CI-Gaeac6-Ready **CM use only** PR is ready for CI testing on Gaea C6 label Feb 3, 2026
@JessicaMeixner-NOAA
Contributor

Thanks for the fix, explanations and the testing!

@DavidHuber-NOAA
Contributor Author

> Experiments starting on 06Z or 12Z will have restarts written on the first 18Z and 00Z cycle while experiments starting on 00Z or 18Z will not have sets written until ARCH_WARMICFREQ days later.

Noting for posterity that users may set ARCH_WARMICFREQ=1 in config.arch_tars until the first full restarts are archived and then revert it to 4, if it is desired to retain the first full set for a case starting on 00Z or 18Z.
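
A small sketch of the effect of that temporary setting, for the ARCH_CYC (group a) cycle only. `first_full_set_after_start` is a hypothetical helper, the 00Z start date is made up, and it assumes (per the comment quoted above) that the SDATE cycle itself does not write a full set:

```python
from datetime import datetime, timedelta


def first_full_set_after_start(sdate, warm_ic_freq, arch_cyc=0, step_hours=6):
    """Return the first arch_cyc cycle after sdate that passes the day-count check.

    Hypothetical helper; assumes the SDATE cycle itself writes no full set.
    """
    if warm_ic_freq <= 0:
        raise ValueError("warm_ic_freq must be positive")
    step = timedelta(hours=step_hours)
    cycle = sdate + step
    # Walk forward one cycle at a time until both the hour and the day count match.
    while not (cycle.hour == arch_cyc and (cycle - sdate).days % warm_ic_freq == 0):
        cycle += step
    return cycle


start_00z = datetime(2021, 12, 20, 0)  # hypothetical 00Z start
with_freq_4 = first_full_set_after_start(start_00z, 4)
with_freq_1 = first_full_set_after_start(start_00z, 1)

print(with_freq_4.strftime("%Y%m%d%H"), with_freq_1.strftime("%Y%m%d%H"))
```

In this sketch a 00Z start waits four days for its first match with a frequency of 4 but only one day with a frequency of 1, which is the gap the temporary setting is meant to close.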

@emcbot emcbot added CI-Gaeac6-Building **Bot use only** CI testing is cloning/building on Gaea C6 CI-Gaeac6-Failed **Bot use only** CI testing on Gaea C6 for this PR has failed and removed CI-Gaeac6-Ready **CM use only** PR is ready for CI testing on Gaea C6 CI-Gaeac6-Building **Bot use only** CI testing is cloning/building on Gaea C6 labels Feb 3, 2026
@TerrenceMcGuinness-NOAA TerrenceMcGuinness-NOAA added CI-Gaeac6-Building **Bot use only** CI testing is cloning/building on Gaea C6 and removed CI-Gaeac6-Failed **Bot use only** CI testing on Gaea C6 for this PR has failed labels Feb 3, 2026
@emcbot emcbot added CI-Gaeac6-Failed **Bot use only** CI testing on Gaea C6 for this PR has failed and removed CI-Gaeac6-Building **Bot use only** CI testing is cloning/building on Gaea C6 labels Feb 3, 2026
@DavidHuber-NOAA
Contributor Author

DavidHuber-NOAA commented Feb 3, 2026

@TerrenceMcGuinness-NOAA I think PRs going into the dev/gfs.v17 branch cannot run on automatic CI (it was worth a try), so I will run this manually.

@DavidHuber-NOAA DavidHuber-NOAA added CI-GaeaC6-Running (CM) CI testing is being run locally on Gaea C6. and removed CI-Gaeac6-Failed **Bot use only** CI testing on Gaea C6 for this PR has failed labels Feb 3, 2026
@TerrenceMcGuinness-NOAA
Collaborator

@DavidHuber-NOAA Oh yes I was just noticing it did not have the requisite hash for gdas. Makes sense now:

From https://github.com/NOAA-EMC/global-workflow
 * [new ref]           refs/pull/4502/head -> fix/arch_warm_v17
Previous HEAD position was fb82cf08 Hotfix: Missing Cron Script Creation for Regular Crontab (#4497)
Switched to branch 'fix/arch_warm_v17'
M	sorc/gdas.cd
M	sorc/gfs_utils.fd
M	sorc/ufs_model.fd
M	sorc/ufs_utils.fd
M	sorc/wxflow
fatal: remote error: upload-pack: not our ref b747e5c70a5f463fcaf5a64b1816df52a7a016a5
fatal: Fetched in submodule path 'sorc/gdas.cd', but it did not contain b747e5c70a5f463fcaf5a64b1816df52a7a016a5. Direct fetching of that commit failed.

@DavidHuber-NOAA DavidHuber-NOAA added the CI-Wcoss2-Running CI testing on WCOSS for this PR is in-progress label Feb 3, 2026
@DavidHuber-NOAA
Contributor Author

All tests completed successfully on C6. Cactus has started running.

@DavidHuber-NOAA DavidHuber-NOAA added CI-Gaeac6-Passed (cm) Manual CI passed on Gaea C6 and removed CI-GaeaC6-Running (CM) CI testing is being run locally on Gaea C6. labels Feb 3, 2026
@DavidHuber-NOAA
Contributor Author

All GFS tests also passed on WCOSS2:

Wed Feb  4 12:46:16 UTC 2026
******** C48_ATM_4502 ********
   CYCLE         STATE           ACTIVATED              DEACTIVATED     
202103231200        Done    Feb 03 2026 19:45:12    Feb 03 2026 20:40:29

******** C48mx500_3DVarAOWCDA_4502 ********
   CYCLE         STATE           ACTIVATED              DEACTIVATED     
202103241800        Done    Feb 03 2026 19:45:27    Feb 03 2026 20:00:33
202103250000        Done    Feb 03 2026 19:45:27    Feb 03 2026 21:20:22

******** C48mx500_hybAOWCDA_4502 ********
   CYCLE         STATE           ACTIVATED              DEACTIVATED     
202103241800        Done    Feb 03 2026 19:45:05    Feb 03 2026 20:00:16
202103250000        Done    Feb 03 2026 19:45:05    Feb 03 2026 21:05:13

******** C48_S2SW_extended_4502 ********
   CYCLE         STATE           ACTIVATED              DEACTIVATED     
202103231200        Done    Feb 03 2026 19:45:08    Feb 03 2026 22:35:15
202103231800        Done    Feb 03 2026 19:45:08    Feb 03 2026 22:05:23

******** C96_atm3DVar_extended_4502 ********
   CYCLE         STATE           ACTIVATED              DEACTIVATED     
202112201800        Done    Feb 03 2026 19:45:06    Feb 03 2026 20:05:14
202112210000        Done    Feb 03 2026 19:45:06    Feb 04 2026 00:25:26
202112210600        Done    Feb 03 2026 19:45:06    Feb 04 2026 01:20:12
202112211200        Done    Feb 03 2026 20:10:14    Feb 04 2026 02:40:13
202112211800        Done    Feb 04 2026 00:30:16    Feb 04 2026 05:10:10

******** C96C48_hybatmDA_4502 ********
   CYCLE         STATE           ACTIVATED              DEACTIVATED     
202112201800        Done    Feb 03 2026 19:45:24    Feb 03 2026 20:05:15
202112210000        Done    Feb 03 2026 19:45:24    Feb 03 2026 22:00:20
202112210600        Done    Feb 03 2026 19:45:24    Feb 03 2026 21:50:09

******** C96C48_hybatmsnowDA_4502 ********
   CYCLE         STATE           ACTIVATED              DEACTIVATED     
202112201200        Done    Feb 03 2026 21:20:22    Feb 03 2026 21:40:25
202112201800        Done    Feb 03 2026 21:20:22    Feb 03 2026 23:35:24
202112210000        Done    Feb 03 2026 21:20:22    Feb 03 2026 23:35:24

******** C96C48_hybatmsoilDA_4502 ********
   CYCLE         STATE           ACTIVATED              DEACTIVATED     
202205150600        Done    Feb 03 2026 19:45:35    Feb 03 2026 20:05:16
202205151200        Done    Feb 03 2026 19:45:35    Feb 03 2026 22:00:26
202205151800        Done    Feb 03 2026 19:45:35    Feb 03 2026 22:00:26

******** C96C48mx500_S2SW_cyc_gfs_4502 ********
   CYCLE         STATE           ACTIVATED              DEACTIVATED     
202112201200        Done    Feb 03 2026 19:45:33    Feb 03 2026 20:05:20
202112201800        Done    Feb 03 2026 19:45:33    Feb 03 2026 22:30:25
202112210000        Done    Feb 03 2026 19:45:33    Feb 03 2026 22:50:19
202112211800        Done    Feb 03 2026 20:10:10    Feb 03 2026 23:00:22

@DavidHuber-NOAA DavidHuber-NOAA added CI-Wcoss2-Passed CI testing on WCOSS for this PR has completed successfully and removed CI-Wcoss2-Running CI testing on WCOSS for this PR is in-progress labels Feb 4, 2026
@DavidHuber-NOAA
Contributor Author

FYI @LydiaStefanova-NOAA this PR will fix the restart archiving issue you noted in the dev/gfs.v17 branch.

Contributor

@CatherineThomas-NOAA CatherineThomas-NOAA left a comment

Thanks so much @DavidHuber-NOAA!

@RuiyuSun

RuiyuSun commented Feb 4, 2026

> After running the test, which I ended up extending to 2021122500, I was able to verify that the 2021122218 and 2021122300 restarts were correct. One somewhat surprising result was that a full suite of restarts was also generated for 2021122018 and 2021122100. I can see why this happens, namely
>
>     days_since_sdate = (arch_dict.current_cycle - SDATE).days
>     if arch_dict.ARCH_FCSTICFREQ > 0 and days_since_sdate % arch_dict.ARCH_FCSTICFREQ == 0:
>         # We are on the right cycle hour and the right day
>         return True
>
> and
>
>     days_since_sdate = (ics_offset_cycle - SDATE).days
>     if arch_dict.ARCH_WARMICFREQ > 0 and days_since_sdate % arch_dict.ARCH_WARMICFREQ == 0:
>         # We are on the right cycle hour and the right day
>         return True
>
> Since we are only looking for mod(days_since_sdate, 2) == 0, it follows that if days_since_sdate = 0, that operation would yield 0. I think I am OK with this as it will enable easy testing in the future that restarts are correctly archived, though it may appear inconsistent to users. Experiments starting on 06Z or 12Z will have restarts written on the first 18Z and 00Z cycle while experiments starting on 00Z or 18Z will not have sets written until ARCH_WARMICFREQ days later.
>
> @JessicaMeixner-NOAA @CatherineThomas-NOAA @RuiyuSun @LydiaStefanova-NOAA Here is the listing of the restarts written for the 2021122218 and 2021122300 cycles. Can you please verify these are correct:
>
> /NCEPDEV/emc-global/1year/David.Huber/URSA/scratch/C96C48mx500_S2SW_cyc_gfs_restart_warm_dev/2021122218:
> enkfgdas_restartb_grp1.tar
> gdasocean_analysis.tar
> gdasocean_restart.tar
> gdaswave_restart.tar
> gdas_restartb.tar
>
> /NCEPDEV/emc-global/1year/David.Huber/URSA/scratch/C96C48mx500_S2SW_cyc_gfs_restart_warm_dev/2021122300:
> gdasice_restart.tar
> enkfgdas_restarta_grp1.tar
> gdasocean_analysis.tar
> gfs_restarta.tar
> gdas_restarta.tar

@DavidHuber-NOAA After comparing your filelist here and the one we used in the script to retrieve the restart files, I confirm that your list contains all the necessary pieces.

@JessicaMeixner-NOAA JessicaMeixner-NOAA merged commit 2b06abd into NOAA-EMC:dev/gfs.v17 Feb 4, 2026
7 checks passed
DavidHuber-NOAA added a commit that referenced this pull request Feb 6, 2026
# Description
This fixes the GDAS-cycle warm restart archiving for group b (previous
cycle) by adding `assim_freq` to the current cycle when checking the
number of days since `SDATE`. This is a companion PR to #4502, which
fixes the same issue in the dev/gfs.v17 branch.

Resolves #4501 for develop

# Type of change
- [x] Bug fix (fixes something broken)
- [ ] New feature (adds functionality)
- [ ] Maintenance (code refactor, clean-up, new CI test, etc.)

# Change characteristics
- Is this change expected to change outputs YES
  - [x] GFS (warm restarts)
  - [ ] GEFS
  - [ ] SFS
  - [ ] GCAFS
- Is this a breaking change (a change in existing functionality)? NO
- Does this change require a documentation update? NO
- Does this change require an update to any of the following submodules?
NO

# How has this been tested?
Will test warm restart capability

# Checklist
- [x] My code follows the style guidelines of this project
- [x] I have performed a self-review of my own code
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] My changes generate no new warnings
- [ ] New and existing tests pass with my changes
- [x] This change is covered by an existing CI test or a new one has
been added

Labels

CI-Gaeac6-Passed (cm) Manual CI passed on Gaea C6 CI-Wcoss2-Passed CI testing on WCOSS for this PR has completed successfully

6 participants