Revamp obs staging and analysis stats job #4306
DavidHuber-NOAA merged 73 commits into NOAA-EMC:develop
This PR, a companion to NOAA-EMC/global-workflow#4306 and NOAA-EMC/GDASApp#1999, makes a small change to the obs statistics application: the key `obs spaces` becomes `observers`. This way, the YAML structure for obs spaces is identical to that of all other JEDI configuration YAMLs, and the `clean_empty_obsspaces()` method in the GW `Jedi` class can be applied to the configuration YAML for this application as well.

Co-authored-by: Cory Martin
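To illustrate why a shared `observers` key is useful, here is a minimal sketch of a generic pruning pass over a configuration dictionary. It is not the GW `clean_empty_obsspaces()` implementation; the function name `drop_missing_observers`, the injected `exists` callable, and the assumed nesting `obs space -> obsdatain -> engine -> obsfile` are all assumptions made for this example.

```python
import os
from typing import Callable


def drop_missing_observers(config: dict,
                           exists: Callable[[str], bool] = os.path.exists) -> dict:
    """Return a copy of `config` keeping only observers whose input file exists.

    Hypothetical helper: assumes each entry under the top-level `observers`
    key stores its input path at obs space -> obsdatain -> engine -> obsfile.
    """
    kept = []
    for observer in config.get("observers", []):
        path = observer["obs space"]["obsdatain"]["engine"]["obsfile"]
        if exists(path):
            kept.append(observer)
    # Shallow copy of the config with the filtered observer list
    return {**config, "observers": kept}
```

Because the function only looks for `observers`, the same pass would work on any configuration that follows that layout, which is the point of the key rename.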
…1999)

# Description

The companion PR supports changes made in NOAA-EMC/global-workflow#4306 in the following ways:

1. It consolidates the base YAML and JEDI config YAMLs for the analysis stats job into a master config YAML and adds keys for the wxflow FileHandler for staging and saving files for that job.
2. Any obs staging YAMLs are removed, since that functionality is now handled by the `Jedi` class in GW.
3. COM paths are removed from all JCB base YAMLs. These paths are now passed as method arguments to the obs staging and diags saving methods that now exist in the `Jedi` class. This way, the JCB base YAML is strictly concerned with paths and file names for the run directories.
4. Changes are made to the JCB base YAMLs for the analysis stats job in GW to be consistent with its refactoring in the companion PR.
5. An obs list is created for the analysis stats job, determining which obs to create statistics for.

Additionally, `copy` and `link` keys in various YAMLs which go into the FileHandler for several jobs become `copy_req` and `link_req`, since the former are now deprecated.

# Companion PRs

NOAA-EMC/global-workflow#4306
NOAA-EMC/jcb-gdas#214
NOAA-EMC/jcb-algorithms#17
NOAA-EMC/DA-utils#49

# Issues

Refs NOAA-EMC/global-workflow#4224
Refs NOAA-EMC/global-workflow#4228

# Automated CI tests to run in Global Workflow

CI testing will be performed as part of review for the GW companion PR. Preliminary testing has already been performed.
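The `copy`/`link` to `copy_req`/`link_req` rename described above can be sketched as a recursive pass over a loaded configuration. This is only an illustration: `rename_deprecated_keys` is a hypothetical helper, not part of wxflow or GDASApp, and it assumes the YAML has already been parsed into plain dictionaries and lists.

```python
def rename_deprecated_keys(node, renames=None):
    """Recursively rename dictionary keys in a parsed YAML structure.

    Hypothetical helper; the default mapping mirrors the rename described
    in the PR (`copy` -> `copy_req`, `link` -> `link_req`).
    """
    renames = renames or {"copy": "copy_req", "link": "link_req"}
    if isinstance(node, dict):
        # Rename matching keys and descend into the values
        return {renames.get(key, key): rename_deprecated_keys(value, renames)
                for key, value in node.items()}
    if isinstance(node, list):
        return [rename_deprecated_keys(item, renames) for item in node]
    # Scalars (strings, numbers, None) are returned unchanged
    return node
```

Note that the pass renames every key called `copy` or `link` at any depth, so it would only be safe on configurations where those names are used exclusively for FileHandler inputs.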
All companion PRs have been merged and the GDAS hash has been updated to develop.
Great, thank you @DavidNew-NOAA. I'll merge in GW develop now and start testing on WCOSS2.
@DavidNew-NOAA The Could you please take a look? The log file can be found here:
@DavidNew-NOAA It's possible that this is being caused by the
Unfortunately, I received the same error with that change added in.
@DavidHuber-NOAA I don't have access to WCOSS to look at the logs. Is it possible to transfer that log file to Hera/Ursa? I'm not sure what could be breaking gdas_sfcanl.
@DavidNew-NOAA Apologies for the late reply. Here is the log file on Ursa: /scratch3/NCEPDEV/global/David.Huber/for_daveN/gfs_sfcanl.log. I'm running some additional tests on WCOSS2 myself to see if this is an issue in develop. Interestingly, both the 00Z and 12Z cycles ran fine for the extended test. This only happened on the 06Z cycle. If this doesn't reproduce in develop, then I will work on creating a reproducer that can be run on Ursa.
It may be worth running a
Thanks @DavidHuber-NOAA, already ahead of you, running CI now. I'll take a look on Sunday.
The develop branch also produces this error, so it appears we have a bug elsewhere that needs squashing. Merging based on otherwise successful testing.
Thanks @DavidHuber-NOAA!
@DavidNew-NOAA @RussTreadon-NOAA @CoryMartin-NOAA It appears that this update to the GDASApp broke the capability to build the GDASApp on compute nodes. The build fails on WCOSS2, Ursa, and Hera while attempting to run pip. The error is identical to NOAA-EMC/GDASApp#1851 (comment). Is this a known issue? Is there any chance of a workaround besides disabling compute node builds for the GDASApp?
Pasting the same comment I added to GDASApp issue #2016:
I do not have a short-term fix. I see that hotfix #4368 has already been merged into g-w.
This PR makes changes which support changes made in NOAA-EMC/global-workflow#4306 and #1999 in the following ways:

1. The obs statistics templates are moved from `algorithms/obsstats` to `model/obsstats`, since these are indeed model templates and not standalone algorithms.
2. The algorithm YAML for the JEDI obs stats application is moved from the jcb-algorithms repo to this repo in `algorithms/obsstats`. The justification is that the algorithm templates in jcb-algorithms should be for general model-agnostic JEDI applications, usually existing in the OOPS repo. This application, on the other hand, is specific to GDASApp.
3. Minor variable name changes are made in the obs statistics templates.
Description
This PR makes changes in two areas of GW code for JEDI jobs.

First, it makes changes to `ush/python/pygfs/jedi/jedi.py` related to observations handling that does the following:

- `stage_observations()` method for the `Jedi` class that stages observations for analysis jobs, rather than relying on the task config YAML in GDASApp for staging. This change is justified by the fact that obs staging is essentially the same across all analysis tasks.
- `Analysis` class into the `Jedi` class for the following reason:
  - `Jedi` class. This ensures that the file structure and naming for obs and their statistics are consistent between how they are staged in GW and how they are staged and saved by JEDI applications. Thus there can never be naming conflicts.
- `Jedi` object is created by the class constructor rather than the `initialize` method. This way, `task_config` doesn't need to be passed to both the class constructor and the `initialize` method. One benefit of this is that it cuts down on the number of times `task_config` is dumped by the logger.
- `Jedi` class code such as hardening and more descriptive method/variable naming.

Second, `ush/python/pygfs/task/analysis_stats.py` is refactored in the following ways:

- `AnalysisStats` class now inherits from `Analysis` rather than `Task`. This allows it to inherit parameters like `APREFIX`, `GPREFIX`, etc.
- `task_config` is never modified after the class constructor is run, consistent now with all other tasks.
- `data_in` and `data_out` keys).
- `Jedi` object, to ensure that file paths and naming are consistent between the GW code and JEDI application configuration YAMLs.

Resolves #4224
Resolves #4228
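As a rough sketch of the idea behind staging observations from a method argument rather than from a task-specific YAML, the function below builds a FileHandler-style dictionary from an obs list and a COM path. It is not the actual `stage_observations()` code: the function name `build_obs_staging`, its parameters, and the `<prefix><name>.nc` file naming are assumptions for illustration, and only the `mkdir` and `copy_req` keys are taken from the discussion above.

```python
import os


def build_obs_staging(obs_names, com_obs, run_obs_dir, prefix):
    """Build a FileHandler-style dict that copies obs files into a run directory.

    Hypothetical sketch: `com_obs` is passed in by the caller instead of being
    read from a base YAML, and the `<prefix><name>.nc` naming is an assumption.
    """
    copy_req = []
    for name in obs_names:
        fname = f"{prefix}{name}.nc"
        # Each entry is a [source, destination] pair
        copy_req.append([os.path.join(com_obs, fname),
                         os.path.join(run_obs_dir, fname)])
    return {"mkdir": [run_obs_dir], "copy_req": copy_req}
```

Since source and destination names come from the same expression, the staged file names cannot drift apart from the names the application is configured to read, which is the consistency argument made in the description.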
Type of change
Change characteristics
How has this been tested?
Clone, build, and full CI suite on Hera
Checklist