@glwagner
Member

@glwagner glwagner commented Mar 5, 2025

@taimoorsohail curious if this is all that is needed.

Still some work to do for proper checkpointing.

@glwagner
Member Author

glwagner commented Mar 5, 2025

Discussed on #373

@taimoorsohail
Collaborator

taimoorsohail commented Mar 5, 2025

Do we also need:

set_time_stepper!(timestepper, args...)

to deal with variable tendencies?

https://github.com/CliMA/Oceananigans.jl/blob/3fe5e9e66b027e38cfbb3ff53b456e85b16b3bfd/src/OutputWriters/checkpointer.jl#L286

@glwagner
Member Author

glwagner commented Mar 5, 2025

Doesn't set!(sim.model.ocean, pickup) call that?

@taimoorsohail
Collaborator

Yep you're right, I completely missed that... I'm testing it with my code to see if it works
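
To illustrate why the tendencies matter on pickup, here is a minimal, self-contained sketch. It assumes an Adams-Bashforth-type time stepper, which needs the tendency from the previous step. Everything in it (`tendency`, `ab2_step`, `run_steps`) is a hypothetical toy, not the Oceananigans implementation of `set!` or `set_time_stepper!`.

```julia
# Toy illustration only: a scalar ODE dy/dt = -y stepped with second-order
# Adams-Bashforth (AB2). None of these names come from Oceananigans.

tendency(y) = -y

# One AB2 step: y_new = y + Δt * (3/2 * G - 1/2 * G_prev).
# Returns the new state and the tendency that becomes "previous" next step.
function ab2_step(y, G_prev, Δt)
    G = tendency(y)
    return y + Δt * (1.5 * G - 0.5 * G_prev), G
end

# Take `n` AB2 steps starting from state `y` and previous tendency `G_prev`.
function run_steps(y, G_prev, Δt, n)
    for _ in 1:n
        y, G_prev = ab2_step(y, G_prev, Δt)
    end
    return y, G_prev
end

Δt = 0.1
y0 = 1.0
G0 = tendency(y0)  # bootstrap: with G_prev == G the first step is forward Euler

# Reference: 20 uninterrupted steps.
y_ref, _ = run_steps(y0, G0, Δt, 20)

# "Checkpoint" after 10 steps, saving both the state and the previous tendency.
y_chk, G_chk = run_steps(y0, G0, Δt, 10)

# A restart that restores the saved tendency reproduces the reference run.
y_full, _ = run_steps(y_chk, G_chk, Δt, 10)

# A restart that drops the tendency re-bootstraps with an Euler step and drifts.
y_lossy, _ = run_steps(y_chk, tendency(y_chk), Δt, 10)
```

In this toy, `y_full` matches `y_ref` while `y_lossy` does not, which is the reason a pickup needs to restore the time stepper's tendencies along with the prognostic fields.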

@glwagner
Member Author

glwagner commented Mar 5, 2025

We could add a test here: https://github.com/CliMA/ClimaOcean.jl/blob/main/test/test_simulations.jl

that file claims to do a "time stepping test" but as far as I can tell, does not actually take a time step!
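
As a rough sketch of what such a test could assert, the following uses only Julia's `Test` standard library and a stand-in model. `ToyModel` and `toy_time_step!` are hypothetical names invented for illustration; a real test would build the coupled model and call its own time-stepping function instead.

```julia
using Test

# Hypothetical stand-in for a coupled model; not the ClimaOcean type.
mutable struct ToyModel
    time :: Float64
    iteration :: Int
end

# Hypothetical stand-in for a time-stepping function: advance the clock by Δt.
function toy_time_step!(model::ToyModel, Δt)
    model.time += Δt
    model.iteration += 1
    return nothing
end

@testset "Time stepping actually advances the clock" begin
    model = ToyModel(0.0, 0)
    Δt = 60.0
    toy_time_step!(model, Δt)

    # The point of the test: after one step, both counters must have moved.
    @test model.iteration == 1
    @test model.time == Δt
end
```

Checking the iteration count and the model time after the step is what separates a test that takes a time step from one that only constructs the simulation.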

@taimoorsohail
Collaborator

Question: if the set!() function changes the timestep of the ocean model only, don't the prescribed atmosphere and radiation become out of sync, since they will begin at (1993, 1, 1) regardless of whether a checkpoint is present, whereas the ocean model will start from whatever month the checkpoint was saved at?

@glwagner
Member Author

glwagner commented Mar 5, 2025

> Question: if the set!() function changes the timestep of the ocean model only, don't the prescribed atmosphere and radiation become out of sync, since they will begin at (1993, 1, 1) regardless of whether a checkpoint is present, whereas the ocean model will start from whatever month the checkpoint was saved at?

This is definitely an issue and why we implemented CliMA/Oceananigans.jl#4148

Next we need to use this feature in ClimaOcean.
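
The out-of-sync concern can be made concrete with a small toy that uses only Julia's `Dates` standard library. `ToyClock`, `ToyCoupledModel`, and `synchronize!` are hypothetical names for illustration; this does not show how Oceananigans or ClimaOcean handle clocks, and the three-month checkpoint time is an arbitrary assumption.

```julia
using Dates

# Hypothetical stand-ins; not the Oceananigans or ClimaOcean types.
mutable struct ToyClock
    time :: DateTime
end

struct ToyCoupledModel
    ocean_clock :: ToyClock
    atmosphere_clock :: ToyClock
end

start = DateTime(1993, 1, 1)
coupled = ToyCoupledModel(ToyClock(start), ToyClock(start))

# Pretend a checkpoint was written three months into the run and that
# picking up restores only the ocean clock.
checkpoint_time = start + Month(3)
coupled.ocean_clock.time = checkpoint_time

# The atmosphere is still at the start date, so the components disagree.
out_of_sync = coupled.ocean_clock.time != coupled.atmosphere_clock.time

# One possible remedy in this toy: copy the restored time to every component.
function synchronize!(model::ToyCoupledModel)
    model.atmosphere_clock.time = model.ocean_clock.time
    return nothing
end

synchronize!(coupled)
in_sync = coupled.ocean_clock.time == coupled.atmosphere_clock.time
```

Whatever mechanism is used in practice, the requirement is the same: after a pickup, every component of the coupled model has to agree on the current time before the next step is taken.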

I am still trying to understand whether we actually want to add the Checkpointer to the coupled simulation directly. I have to go look through the source code to see if that will work out of the box or whether more changes are needed.

Sorry, lots of things are happening. I am doing quite a few things at the same time, so I'm trying to stay afloat! Any help is greatly appreciated.

@glwagner
Member Author

glwagner commented Mar 5, 2025

There is also https://github.com/CliMA/ClimaOcean.jl/pull/355/files, which is related. I think I will update that PR soon and work on making sure the correct time-step is taken.

@taimoorsohail
Collaborator

Thanks Greg! Sorry I'm on week 3 of learning Julia, Oceananigans and ClimaOcean so still struggling with it all. Hopefully I'll be able to make a more meaningful contribution when I understand things a bit more! In the meantime, no rush on getting things changed - I will also try to debug on my end.

@glwagner
Member Author

glwagner commented Mar 5, 2025

> Thanks Greg! Sorry I'm on week 3 of learning Julia, Oceananigans and ClimaOcean so still struggling with it all. Hopefully I'll be able to make a more meaningful contribution when I understand things a bit more! In the meantime, no rush on getting things changed - I will also try to debug on my end.

True! It could help if you work on a setup that doesn't require a checkpointer. Workflows that include checkpointing are quite slow, best for production...

@navidcy
Member

navidcy commented Mar 6, 2025

> True! It could help if you work on a setup that doesn't require a checkpointer. Workflows that include checkpointing are quite slow, best for production...

I'm not sure what you mean here. How do you run anything longer than a few hours if you are not able to restart it? And if something happens and the internet/ssh connection breaks, you should be able to restart it.

I'll try to add the functionality in ClimaOcean; it'll be needed for sure. Just haven't found the time yet :(

@glwagner
Member Author

glwagner commented Mar 6, 2025

> > True! It could help if you work on a setup that doesn't require a checkpointer. Workflows that include checkpointing are quite slow, best for production...
>
> I'm not sure what you mean here. How do you run anything longer than a few hours if you are not able to restart it? And if something happens and the internet/ssh connection breaks, you should be able to restart it.
>
> I'll try to add the functionality in ClimaOcean; it'll be needed for sure. Just haven't found the time yet :(

Use tmux to ensure sessions remain if connections are broken or to keep jobs running when you need to step away.

I'm only talking about prototyping, it's true that if you want to run for 100s of years you of course need to checkpoint. Just a few years can be done without it though.

@navidcy
Member

navidcy commented Mar 7, 2025

> I'm only talking about prototyping, it's true that if you want to run for 100s of years you of course need to checkpoint. Just a few years can be done without it though.

Oh I see, gotcha. I was thinking longer runs, not playing around-prototyping.

@navidcy
Member

navidcy commented Mar 7, 2025

Closing this in favor of #381

@navidcy navidcy closed this Mar 7, 2025
@glwagner
Member Author

glwagner commented Mar 7, 2025

> > I'm only talking about prototyping, it's true that if you want to run for 100s of years you of course need to checkpoint. Just a few years can be done without it though.
>
> Oh I see, gotcha. I was thinking longer runs, not playing around-prototyping.

I only dispute the implication that prototyping is playing around. It's not; it's important.

@navidcy
Member

navidcy commented Mar 7, 2025

The dash should be a "/", i.e., “prototyping or playing around”! Sorry.

@giordano giordano deleted the glw/pickup branch March 15, 2025 15:48
3 participants