-
Notifications
You must be signed in to change notification settings - Fork 10
v0.3 dev #226
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Consolidates/unifies the old dual ship_config.yaml and schedule.yaml config files into one expedition.yaml file, in line with v1 dev objectives.
Fix unnecessarily divergent branches.
…ns across multiple files
…move outdated comments
… to include get_instruments method
…edule and ShipConfig classes
|
A bit of a general comment, and tangential to this PR (I know that this PR doesn't really touch the configs - maybe for future) This is an ERD diagram for the Expedition file - in particular I'm interested in the instrument configs In the /design-doc.md we didn't really discuss the responsibility of the instrument configs - what role do they play in the software (i.e., is it possible to have multiple different configurations for the same instrument?) Perhaps in future we can define instruments to be used in the expedition, and then refer to specific instrument IDs in the waypoints (i.e., instead of open in Mermaid editor (much easier to see) ---
title: Entity Relationship Diagram for the Expedition Pydantic model
---
erDiagram
direction TB
Expedition {
Schedule schedule ""
InstrumentsConfig instruments_config ""
ShipConfig ship_config ""
}
Schedule {
list[Waypoint] waypoints ""
Optional[SpaceTimeRegion] space_time_region ""
}
InstrumentsConfig {
ArgoFloatConfig_optional argo_float_config ""
ADCPConfig_optional adcp_config ""
CTDConfig_optional ctd_config ""
CTD_BGCConfig_optional ctd_bgc_config ""
ShipUnderwaterSTConfig_optional ship_underwater_st_config ""
DrifterConfig_optional drifter_config ""
XBTConfig_optional xbt_config ""
}
ShipConfig {
float ship_speed_knots ""
}
Waypoint {
Location location ""
datetime_optional time ""
InstrumentType_optional instrument ""
}
SpaceTimeRegion {
SpatialRange spatial_range ""
TimeRange time_range ""
}
Location {
float latitude ""
float longitude ""
}
InstrumentType {
}
SpatialRange {
float minimum_longitude ""
float maximum_longitude ""
float minimum_latitude ""
float maximum_latitude ""
float_optional minimum_depth ""
float_optional maximum_depth ""
}
TimeRange {
datetime_optional start_time ""
datetime_optional end_time ""
}
ArgoFloatConfig {
float min_depth_meter ""
float max_depth_meter ""
float drift_depth_meter ""
float vertical_speed_meter_per_second ""
float cycle_days ""
float drift_days ""
}
ADCPConfig {
float max_depth_meter ""
int num_bins ""
timedelta period ""
}
CTDConfig {
timedelta stationkeeping_time ""
float min_depth_meter ""
float max_depth_meter ""
}
CTD_BGCConfig {
timedelta stationkeeping_time ""
float min_depth_meter ""
float max_depth_meter ""
}
ShipUnderwaterSTConfig {
timedelta period ""
}
DrifterConfig {
float depth_meter ""
timedelta lifetime ""
}
XBTConfig {
float min_depth_meter ""
float max_depth_meter ""
float fall_speed_meter_per_second ""
float deceleration_coefficient ""
}
Expedition||--||Schedule:"contains"
Expedition||--||InstrumentsConfig:"contains"
Expedition||--||ShipConfig:"contains"
Schedule||--|{Waypoint:"contains"
Schedule||--o|SpaceTimeRegion:"optional"
Waypoint||--||Location:"has"
Waypoint||--o|InstrumentType:"optional (single or list)"
SpaceTimeRegion||--||SpatialRange:"contains"
SpaceTimeRegion||--||TimeRange:"contains"
InstrumentsConfig||--o|ArgoFloatConfig:"optional"
InstrumentsConfig||--o|ADCPConfig:"optional"
InstrumentsConfig||--o|CTDConfig:"optional"
InstrumentsConfig||--o|CTD_BGCConfig:"optional"
InstrumentsConfig||--o|ShipUnderwaterSTConfig:"optional"
InstrumentsConfig||--o|DrifterConfig:"optional"
InstrumentsConfig||--o|XBTConfig:"optional"
|
Yes, I think this could also relate to something @iuryt , @ammedd and I were discussing offline - whether we could combine e.g. |
Co-authored-by: Erik van Sebille <[email protected]>
Co-authored-by: Erik van Sebille <[email protected]>
VeckoTheGecko
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great effort and a great step in the right direction - I've left some suggestions
src/virtualship/instruments/base.py
Outdated
| def __init__( | ||
| self, | ||
| name: str, | ||
| expedition: "Expedition", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All that is accessed in the Instrument class is self.expedition.schedule.space_time_region - passing in the whole expedition is a bit overkill.
Can we just update to space_time_region
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some instruments rely on other parts of the expedition object. For example, ADCPInstrument takes:
MAX_DEPTH = self.expedition.instruments_config.adcp_config.max_depth_meter
...
NUM_BINS = self.expedition.instruments_config.adcp_config.num_binsSo, for now, I favour leaving it as it is (even if overkill for the most part). That being said, I think underway instruments (including ADCPs) can be overhauled (new Issue to follow). At which point, I think some of this can be tidied up and then we can update to just using space_time_region.
| @@ -0,0 +1,284 @@ | |||
| import abc | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| import abc | |
| from __future__ import annotations | |
| import abc |
then you don't have to do "Expedition" in the type annotation
src/virtualship/instruments/base.py
Outdated
|
|
||
| def __init__( | ||
| self, | ||
| name: str, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name is not needed. Where you call self.name you can instead do self.__class__.__name__
| "depth": "depth", | ||
| } # same dimensions for all instruments | ||
| self.add_bathymetry = add_bathymetry | ||
| self.allow_time_extrapolation = allow_time_extrapolation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that this is actually used anywhere in the class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is supposed to be fed through to _generate_fieldset() - will add back in!
src/virtualship/instruments/base.py
Outdated
| self.directory = directory | ||
| self.filenames = filenames |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think directory and filenames are used in this class
| class ArgoFloatConfig(pydantic.BaseModel): | ||
| """Configuration for argos floats.""" | ||
|
|
||
| min_depth_meter: float = pydantic.Field(le=0.0) | ||
| max_depth_meter: float = pydantic.Field(le=0.0) | ||
| drift_depth_meter: float = pydantic.Field(le=0.0) | ||
| vertical_speed_meter_per_second: float = pydantic.Field(lt=0.0) | ||
| cycle_days: float = pydantic.Field(gt=0.0) | ||
| drift_days: float = pydantic.Field(gt=0.0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of the Pydantic models are in the models folder - this makes it clear what is configuration and what is the rest of the code. Although the instrument config is not with the instrument logic, it is with all the other config related code which is nice to easily see the structure of the overall config. If we move configuration code into the rest of the codebase we lose this separation and would significantly increase our chance of circular imports.
src/virtualship/instruments/base.py
Outdated
| spec = self.buffer_spec if spec_type == "buffer" else self.limit_spec | ||
| return spec.get(key) if spec and spec.get(key) is not None else default | ||
|
|
||
| def _find_files_in_timerange( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make this a stand along function and not a method? (no dependency on self)
| from virtualship.models import Expedition | ||
|
|
||
|
|
||
| class Instrument(abc.ABC): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in future we can further refactor out the dataset component (which I think would make it easier to work with intake etc)
|
|
||
| By default, VirtualShip will automatically 'stream' data from the Copernicus Marine Service via the [copernicusmarine toolbox](https://github.com/mercator-ocean/copernicus-marine-toolbox?tab=readme-ov-file). However, for users who wish to manage data locally, it is possible to pre-download the required datasets and feed them into VirtualShip simulations. | ||
|
|
||
| <!-- TODO: quickstart guide needs full update! --> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updating the Quickstart guide is a priority after this PR gets merged, as we have students who will be working with VirtualShip on November 27th again
src/virtualship/errors.py
Outdated
|
|
||
| class ConfigError(RuntimeError): | ||
| class InstrumentsConfigError(RuntimeError): | ||
| """An error in the config.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does "the config" refer to expedition.yaml?
src/virtualship/instruments/adcp.py
Outdated
|
|
||
| def simulate(self, measurements, out_path) -> None: | ||
| """Simulate ADCP measurements.""" | ||
| MAX_DEPTH = self.expedition.instruments_config.adcp_config.max_depth_meter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For realism I would limit this to a max of 1000 and the max from the config
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean max should be hard coded to 1000 or determined from the config? Or did you mean max and min?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please determine it from the config. But if students set it way to high, let's have a limit on that. To my understanding there is a hard limit from technology and the amount of scattering material in the ocean. In my memory it was 1000m but I now find many sources that say 1300m (like https://oceanexplorer.noaa.gov/technology/acoust-doppler/) or even 1600m.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood - I'll add in a warning and a revert to default if max depth is too deep.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update: I've left it as users being able to use their prescribed max depth even if it exceeds the authentic limit, in case it is intentional and there is a non-education/non-authentic sea-expedition related reason. The warning message will be clear in stating that it's inauthentic though and that performance may suffer.
Adjust t_min to the first day of the month based on schedule start date.
for more information, see https://pre-commit.ci
Overview
This PR introduces substantial refactoring, new features, and documentation updates to VirtualShip. Overall, the changes aim to unify configuration, centralise instrument logic and overhaul data ingestion.
Note
Update: this PR will incorporate changes intended for a v0.3 release (title now also updated) - it was previously slated to be a v1 release. Since then, we have decided to delay the v1 release until the public API is fully stable. The roadmap to a v1 is outlined via the Issue tracker with those marked with a "v1-dev" label.
Major Changes
Configuration unification
schedule.yamlandship_config.yamlare now merged into a singleexpedition.yaml. This simplifies configs and streamlines things up a bit.Instrument logic refactor
instruments/directory, using a base class and subclass structure.Instrumentclass in base.py which handles universal instrument simulation logic, and then each instrument has a sub-class (e.g.CTDInstrumentin ctd.py) which handles instrument-specific logic. This leads to a lot less repetition across the codebase.expedition/directory is now largely empty, but schedule_simulate.py remains.expedition/directory can evaporate.do_expedition.pyhas been moved to _run.py in the CLI.Data ingestion overhaul
copernicusmarinedata 'streaming'. This step takes place in thevirtualship runstep, and the previousfetchlogic is fully removed.virtualship runalso now supports linking to pre-downloaded data via a-from-dataCLI argument. This allows users to point to local or remote data stores (e.g., SURF) instead of downloading on-the-fly, if preferred.intake(Useintakefor data fetching #190) could help here to ease some of the rigidity...? @VeckoTheGecko thoughts?Performance considerations
ScipyParticle-based instruments. So this can probably be fixed in a future PR and/or when Parcels v4 is integrated.Tests
Other Considerations
space_time_regionis not feasible currently due to RAM limitations. I think this can be revisted at the stage of integration with Parcels v4.To Dos / Nice-to-Haves
simulate_schedule.pyto base/subclass structure; removeexpedition/dir.Instrumentsub-class.Please note...!
Reviewers...
@erikvansebille @VeckoTheGecko @ammedd