Description
I believe our routed outputs would significantly improve with appropriate t-route channel restarts. For example, our short range forecasts all look a little strange since we set all initial flows to 0. There are probably infinite ways that we could go about implementing this, but these are the ideas I have. Would love any comments and feedback before I start implementing.
Restart file requirements
The most basic restart file is a file with a dataframe that contains upstream initial flow, downstream initial flow, and depth for each reach.
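A minimal sketch of that table, assuming pandas is available. The column names `qu0`, `qd0`, and `h0` and the `reach_id` index name are placeholders chosen for illustration and should be checked against whatever t-route's restart readers actually expect. The all-zero table mirrors the current cold-start behavior described above.

```python
import pandas as pd

# Assumed column names: upstream flow, downstream flow, depth.
# These are illustrative and may not match t-route's expected names.
RESTART_COLUMNS = ["qu0", "qd0", "h0"]


def make_cold_start_restart(reach_ids):
    """Build an all-zero restart table with one row per reach."""
    index = pd.Index(reach_ids, name="reach_id")
    return pd.DataFrame(0.0, index=index, columns=RESTART_COLUMNS)


def validate_restart(df):
    """Raise ValueError if the table is not a usable restart state."""
    missing = [c for c in RESTART_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    if not df.index.is_unique:
        raise ValueError("duplicate reach ids")
    values = df[RESTART_COLUMNS]
    if values.isna().any().any():
        raise ValueError("restart contains NaN values")
    if (values < 0).any().any():
        raise ValueError("restart contains negative flow or depth")
    return df


restart = validate_restart(make_cold_start_restart([101, 102, 103]))
```

A check like `validate_restart` could run on every generated file before it is uploaded, whichever data source is chosen.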
Data source options
1. NWM operational hourly standard analysis/assimilation CHRTOUT files, non-self-cycling (within NRDS)
From the NWM website:
The Standard Analysis and Assimilation configuration cycles hourly and produces a real-time analysis of the current streamflow and other surface and near-surface hydrologic states across the contiguous United States (CONUS). This configuration is internally cycling, with each subsequent Standard Analysis starting from the previous hour’s run. The exception is the 19Z Standard Analysis cycle which ingests initial conditions from the Extended Analysis below. The Standard Analysis also produces restart files each hour which are used to initialize the short-, medium-, and long-range forecast simulations. Meteorological forcing data are drawn from the MRMS Gauge-adjusted and Radar-only observed precipitation products along with short-range RAP and HRRR, while stream-gauge observations are assimilated from the USGS.
The idea with non-self-cycling is that each forecast run is initialized from this external source. (As opposed to a forecast run spitting out a restart file which initializes the next forecast run, etc)
Advantages: Baked in USGS/RFC/USACE data assimilation, wouldn't be self-cycling so our model errors don't propagate
Disadvantages: We would need to convert from the NHD hydrofabric to the NextGen hydrofabric. The conversion from CHRTOUT file to t-route restart would need to happen every hour and would probably incorporate some tools from forcingprocessor. It requires more infrastructure and scheduling work, and it places implicit trust in the quality of NWM 3.0 output. Also, CHRTOUT files do not contain depth, so we would need to calculate it from some combination of flow, velocity, and channel geometry attributes. Some thoughts on depth calculations: jameshalgren/channel_restart_reformat#3
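One possible form of the depth calculation, as a rough sketch only: it assumes a simple trapezoidal cross-section (bottom width `b`, side slope `z` as horizontal run per unit rise) and ignores any floodplain or compound-channel section. Flow area follows from `A = Q / v`, and depth from solving `A = (b + z*h) * h` for `h`. The function and parameter names are hypothetical, and the linked issue may settle on a different approach.

```python
import math


def depth_from_flow_velocity(flow, velocity, bottom_width, side_slope):
    """Estimate depth (m) in a trapezoidal channel.

    flow:         discharge Q in m^3/s
    velocity:     mean velocity v in m/s
    bottom_width: channel bottom width b in m
    side_slope:   z, horizontal distance per unit vertical rise
    """
    if bottom_width < 0 or side_slope < 0 or (bottom_width == 0 and side_slope == 0):
        raise ValueError("invalid channel geometry")
    # No flow or no velocity: treat the reach as dry.
    if flow <= 0.0 or velocity <= 0.0:
        return 0.0
    area = flow / velocity  # continuity: A = Q / v
    if side_slope == 0.0:
        # Rectangular channel: A = b * h
        return area / bottom_width
    # Positive root of z*h^2 + b*h - A = 0
    discriminant = bottom_width ** 2 + 4.0 * side_slope * area
    return (-bottom_width + math.sqrt(discriminant)) / (2.0 * side_slope)


# b = 10 m, z = 2, h = 1.5 m gives A = 19.5 m^2; at v = 1.3 m/s, Q = 25.35 m^3/s
depth = depth_from_flow_velocity(25.35, 1.3, 10.0, 2.0)
```

Treating zero or negative flow and velocity as a dry reach is a design choice made here for simplicity; the real conversion might prefer to flag those reaches instead.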
2. Home-grown t-route restart file, self-cycling within NRDS
If we run a decent retrospective model for somewhere between a couple of months and a year, t-route should stabilize, and we can pull the restart file from that retrospective run. Then we initialize a forecast NRDS run with that restart file, which will generate the next run's restart file, and so on.
Advantages: No hydrofabric conversion needed
Disadvantages: Need confidence in a retrospective run, self-cycling may propagate model errors, we would have to implement USGS DA separately, we can't access raw RFC/USACE data due to security concerns
3. OWP-sourced t-route restart file, self-cycling within NRDS
This is the same concept as above, except we ask OWP for an initial restart file that they are willing to give us instead of making it ourselves.
Advantages: We get an officially sanctioned restart file
Disadvantages: Hydrofabric conversion required, same issues with DA and self-cycling as in 2, relies on the goodwill of others
Restart file format
T-route is currently configured to read pickles and netCDFs. read_lite_restart reads pickles, and get_channel_restart_from_wrf_hydro reads netCDFs. I'm concerned that pickle files won't be readable across EC2 instances (might be an unfounded fear), and more concerned that users won't be able to read our pickled restarts on their own computers if they download them from S3. For this reason I think it would be best to have our restarts in netCDF format.
This does, however, require us to fill in our t-route configuration yaml file with something like this:
wrf_hydro_channel_restart_file: s3_bucket/restarts/channel_restart_02232026130000
which is kind of misleading, since it's not a WRF-Hydro restart file at all. If this semantics quibble is actually a real concern, it should be pretty easy to edit read_lite_restart to read Parquet files as well as pickles.
Implementation
Option 1 under data source options is clearly the one I've thought the most about, so I really only have ideas for implementing that one, but options 2 and 3 aren't that different.
- CIROH-UA/t-route may need minor edits
- Scheduled forcingprocessor run before any model runs
  FP already has what I need to convert between hydrofabrics, read geopackage data, and use nwmurl to read NWM operational data, so it would make sense to preprocess the restart files here.
- Restart file saved into S3
- Edit datastreamcli's t-route configuration generation to have the restart key point to the right filepath in S3
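The last step could look something like the sketch below. It assumes the timestamp in the example key above (`channel_restart_02232026130000`) is `MMDDYYYYHHMMSS`, which is inferred from that single example, and the helper names and default prefix are hypothetical rather than existing datastreamcli functions.

```python
from datetime import datetime

# Assumed prefix, copied from the example key in this issue.
DEFAULT_PREFIX = "s3_bucket/restarts"


def restart_key(valid_time, prefix=DEFAULT_PREFIX):
    """Build the restart file path for a given valid time."""
    # Assumed naming convention: MMDDYYYYHHMMSS
    stamp = valid_time.strftime("%m%d%Y%H%M%S")
    return f"{prefix}/channel_restart_{stamp}"


def restart_config_line(valid_time, prefix=DEFAULT_PREFIX):
    """Render the t-route YAML entry that points at the restart file."""
    return f"wrf_hydro_channel_restart_file: {restart_key(valid_time, prefix)}"


line = restart_config_line(datetime(2026, 2, 23, 13, 0, 0))
```

Deriving the key from the forecast start time in one place would let forcingprocessor (writing the file) and datastreamcli (writing the config) agree on the path without passing it between them.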