Partial fixes for multisite ensembles #3654
Open
+52
−45
Description
A couple small fixes and a questionable hack for running ensemble and uncertainty analyses without database access.
SA-median/ isn't overwritten for each site in turn
Note that the latter step is done independently for each call to write.*.configs, so in a multisite run this will effectively set up a separate ensemble/SA for each site. This was what I wanted today, but I suspect most people will want outputs aggregated across sites, which this PR does not implement.
Motivation and Context
For the MAGiC project I wanted to quickly evaluate AGB timeseries from many sites. The timeseries plots from the ensemble analysis would be perfect for this, except that I'm running with no Bety access and the existing code sets the ensemble ID to NOENSEMBLEID, making each site overwrite the outputs from the previous one.
Since the issue applies to both ensemble and sensitivity analyses, I tried to implement a fix for both, but note that I focused on avoiding collisions between distinct ensembles; there are still places where two sites with the same ensemble ID will overwrite each other.
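The collision described above can be illustrated with a minimal sketch. The helper `output_name()` and its file-name pattern are hypothetical and are not PEcAn's actual naming scheme, and the site IDs are made up; the point is only that a constant placeholder ID maps every site to the same output name.

```r
# Hypothetical helper: builds an output file name keyed only on the ensemble ID.
# The naming pattern is illustrative, not PEcAn's real one.
output_name <- function(ensemble_id) {
  paste0("ensemble.ts.", ensemble_id, ".Rdata")
}

site_ids <- c("site_A", "site_B", "site_C")  # made-up site IDs

# Without database access, every site gets the same placeholder ID,
# so all three sites resolve to a single file name.
shared <- vapply(site_ids, function(s) output_name("NOENSEMBLEID"), character(1))
length(unique(shared))    # 1

# Giving each site its own ID keeps the output names apart.
distinct <- vapply(site_ids, output_name, character(1))
length(unique(distinct))  # 3
```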
I'm pasting my working notes below. @divine7022 and @dlebauer will likely want to consider the unresolved issues in their work on multisite sensitivity.
write.configs fails if SA is requested in a settings with ensemble size > 1
=> unresolved
(minor): README.txt does not specify which met/IC/soil/event inputs were used
=> unresolved
rundir SA-<pft>-<var>-<quantile> contents are overwritten by each site in turn
=> Resolved by adding site id to the get.run.id call
rundir SA-median- tries to run analysis for "ALL PFT", fails on NAs from pfts not present at that site
=> Resolved by having run.sensitivity.analysis subset PFTs to those in run$site$site.pft. The PFT doesn't show up in rundir names, but since there is only one PFT per site it works.
each site's call to run.write.configs overwrites sensitivity.samples (but I think this is where run IDs are taken from)
=> Unresolved
runModule.run.sensitivity.analysis overwrites outputs as it runs for each site
=> Workaround implemented by setting null ensemble.ids to rlang::hash(settings), but we might consider settings$run$site$id instead. Are there cases where a multiSettings might contain multiple entries from the same site? Or where site ID would be unset?
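To make the trade-off in the last note concrete, here is a rough sketch of one possible fallback order. It is not the code in this PR: `fallback_ensemble_id()` is a hypothetical helper, and passing the ensemble ID as a separate argument is an assumption made to keep the example self-contained. It prefers an existing ID, then settings$run$site$id, and only hashes the settings when the site ID is unset.

```r
# Hypothetical helper, not part of PEcAn: pick an ID for one site's settings
# when no ensemble ID has been assigned.
fallback_ensemble_id <- function(ensemble_id, settings) {
  # Keep an ID that was already assigned.
  if (!is.null(ensemble_id)) {
    return(ensemble_id)
  }
  # Otherwise try the site ID, which is readable but may be unset.
  site_id <- settings$run$site$id
  if (!is.null(site_id) && !is.na(site_id)) {
    return(as.character(site_id))
  }
  # Last resort: hash the whole settings object (requires the rlang package).
  if (!requireNamespace("rlang", quietly = TRUE)) {
    stop("rlang is needed to hash settings")
  }
  rlang::hash(settings)
}

# Made-up single-site settings for illustration.
s1 <- list(run = list(site = list(id = "site_A")))
fallback_ensemble_id("E1", s1)  # existing ID is kept
fallback_ensemble_id(NULL, s1)  # falls back to the site ID
```

Note that the site-ID branch would still collide if a multiSettings held two entries for the same site, whereas the hash differs whenever any part of the settings differs, so the open questions above decide which order is appropriate.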