Feat/obs interpolated position #104
Looks like the CI test pipeline currently fails because of something other than this code:
(@gauteh are you familiar with this CI error? It looks like something standard, maybe something in the trajan CI in general that could be updated? How do you want to proceed? If you push a fix to main I can rebase on it :) )
Fixes #103.
Hi Jean, I'm looking at this now and thinking about how to do this so that we get a consistent set of features and methods in TrajAn. It seems that this functionality is very similar to that of

The other question is how much post-processed data to include in these files. The goal should not be a one-liner in trajan in itself. You have to call that one-liner from some Python script anyway, so when you generate data for a particular end user or purpose it is better to add the specific adaptations there as another line or two and keep the default in trajan as basic as possible. That will make the methods easier to compose into blocks for different purposes. We see in OpenDrift that adding a lot of particular functions or arguments to functions is a never-ending game: the next use case is slightly different and a new function or argument is needed. Better to have the user write a few lines than one very specific one.
Hi! Small, well-defined methods with a very specific and clear task and input/output are therefore desired, rather than larger methods that are very convenient (a "one-liner") for a specific task but just slightly unusable for similar tasks.
(note to myself: this PR currently breaks
Hi again! :) Sorry it took a bit of time for me to come back to you! I understand your comments about code duplication :) . A few thoughts:
and related uses of the

This will avoid, for example, automatically interpolating over a sensor that had been working intermittently. So instead of interpolating in case of missing data, NaNs will be introduced:

I do not think that there is a way to do this with scipy's

Would this kind of functionality be interesting for trajan? I agree that this is overkill for, e.g., model data output, but for actual buoy data, this may be useful?
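# %%
from pathlib import Path
import trajan  # importing trajan registers the .traj accessor on xarray datasets
from trajan.readers.omb import read_omb_csv  # import path assumed from the trajan package layout
# %%
# generate a trajan dataset from the OMB test data
# (assumes the script is run from a directory that is a sibling of tests/)
path_to_test_data = Path.cwd().parent / "tests" / "test_data" / "csv" / "omb3.csv"
xr_buoys = read_omb_csv(path_to_test_data)
# %%
xr_buoys
# %%
# regrid all trajectories onto a regular 15-minute time grid
xr_buoys.traj.gridtime('15min')
# %% gives: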
and
possible directions for PRs
If working in this direction, I could start by implementing this only for "scalar" quantities, not, e.g., 1D spectra fields.
Another consideration: this is actually something that is maybe more the responsibility of the OMB decoder than of trajan. Since the OMB decoder is part of trajan so that it can be used as a reader (which is very convenient, and I would like this to continue if you are ok with this :) ), would you be more willing to let me still include this functionality if I move the functions etc. fully into the OMB decoder related files, so that this is "contained" there? This way, this function would not really be the responsibility of core trajan / "you". Let me know if this would be an acceptable solution - if so, I can easily adapt the PR :) .
I don't have the complete overview, but here are my current thoughts so that you don't have to wait for too long for some feedback:
I am wondering if maybe we should rather try to make an To summarize:
My concern is that I do not want to perform "brutal, blind" interpolation in my case: we may easily have holes in the time series, in particular for buoys with different sensors sending data over Iridium, so interpolating "brutally" may silently generate nonsense data. Therefore, given that users ask for the interpolated position at the time of the spectra for the OMB, I have to perform the interpolation with a "careful interpolator". I asked scipy (I think the "import chain" is trajan <- xarray <- scipy <- numpy, but numpy is too "fundamental" for adding lots of options) whether this could be implemented "in general", but the response was that this is too use-case specific, so it should be implemented ad hoc by the user:

This is why I would like to implement a separate interpolation function that can be used when there is a risk of holes in the data (as for buoys and other observations, and in particular in the OMB decoder to answer the user's request). I agree that when working with model data on a regular time (or space) grid, where the data are of "perfect" quality (in the sense that there are never holes or missing data), the functions that you mention and that are already implemented work fine, and it is fine to use them to perform interpolation. But I think that the case I have here is a bit different because of these holes, and that this will necessarily be a bit ad hoc :) . My hope is that the way I have written: means that it should be easy to re-use in other places in decoders / workflows where the same "careful interpolation in case of holes" use case arises :) .
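To make the "silently generating nonsense data" concern concrete, here is a minimal sketch with made-up numbers (the times, latitudes and variable names are hypothetical, not taken from the OMB test data). It only shows that plain linear interpolation, here `np.interp`, returns finite values inside a long hole without any warning:

```python
import numpy as np

# Hypothetical GPS fixes: time in hours, latitude in degrees.
# The receiver was silent between t = 2 h and t = 10 h.
gps_time = np.array([0.0, 1.0, 2.0, 10.0, 11.0])
gps_lat = np.array([70.0, 70.1, 70.2, 71.0, 71.1])

# Hypothetical wave-spectrum times; two of them fall inside the hole.
spectra_time = np.array([0.5, 1.5, 4.0, 8.0, 10.5])

# Plain linear interpolation has no notion of "too far from any fix".
blind_lat = np.interp(spectra_time, gps_time, gps_lat)

# Every output is finite, including the two values inside the 8 h hole.
print(blind_lat)
print(np.isfinite(blind_lat).all())
```

The values at t = 4 h and t = 8 h look plausible but are supported by no nearby fix, which is the situation a "careful interpolator" is meant to flag.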
Thanks for the PR #136 showing how to do something similar with existing tooling @gauteh ! :) I think that there are actually 2 different questions here:
So I think that we are talking about slightly different things: your example focuses on the first point, but makes no guarantee about the second point. My PR here is actually mostly interested in the second point (though I naturally also have to consider technicalities, so there are some technical aspects to make the first point work too :) ).
Is the goal to interpolate / grid to a grid (like gridtime) or to the observation times of the IMU? |
The goal is to interpolate "the GPS position at the GPS times" to "the GPS position at the wave spectra times", without risking a wildly invalid interpolation in case there are some holes in the data :) .
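A minimal sketch of what such a gap-aware interpolation could look like, under stated assumptions: the function name `interp_with_max_gap`, the `max_gap` parameter and the sample data are hypothetical and are not the implementation in this PR. The criterion used here (reject a target time when the two bracketing source samples are more than `max_gap` apart) is one possible choice; a criterion based on the distance to the nearest sample would be another.

```python
import numpy as np


def interp_with_max_gap(target_time, source_time, source_values, max_gap):
    """Linear interpolation that returns NaN instead of bridging long holes.

    source_time must be sorted in increasing order and hold at least two samples.
    """
    target_time = np.asarray(target_time, dtype=float)
    source_time = np.asarray(source_time, dtype=float)
    source_values = np.asarray(source_values, dtype=float)

    # Linear interpolation, with NaN (no extrapolation) outside the source range.
    result = np.interp(target_time, source_time, source_values,
                       left=np.nan, right=np.nan)

    # Indices of the source samples that bracket each target time.
    right = np.searchsorted(source_time, target_time, side="left")
    right = np.clip(right, 1, len(source_time) - 1)
    left = right - 1

    # Mask targets whose bracketing samples are too far apart,
    # unless the target coincides exactly with a source sample.
    gap = source_time[right] - source_time[left]
    exact = np.isin(target_time, source_time)
    result[(gap > max_gap) & ~exact] = np.nan
    return result


# Hypothetical GPS fixes (hours, degrees) with a hole between 2 h and 10 h.
gps_time = np.array([0.0, 1.0, 2.0, 10.0, 11.0])
gps_lat = np.array([70.0, 70.1, 70.2, 71.0, 71.1])
spectra_time = np.array([0.5, 1.5, 4.0, 8.0, 10.5])

# Accept at most 3 h between the two fixes surrounding a spectrum time.
careful_lat = interp_with_max_gap(spectra_time, gps_time, gps_lat, max_gap=3.0)
print(careful_lat)
```

With these numbers, the two spectrum times inside the hole come back as NaN while the other three are interpolated as usual. Times are plain floats here; with `datetime64` timestamps one would first convert to a numeric offset (for example seconds since the first sample).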
Ok. I think this should go in a

I can push up a small skeleton, I also added an

On another note, the example_indepth_... example is getting pretty long, meaning it is easy to mess it up when changing unrelated parts of the code. Maybe it's better to have shorter examples? That makes it easier for the user to find exactly what they are looking for (even if it means some duplication).
59e268f to 54b5c7d
Following recurrent user requests, this PR: