
Conversation

@erikvansebille
Member

@erikvansebille commented Sep 4, 2025

This PR addresses #2175 by exploring whether it is possible to remove the while-loop in the _search_indices_curvilinear_2d function.

Currently, there are three breaking unit tests:

FAILED tests/v4/test_index_search.py::test_grid_indexing_fpoints - AssertionError: Expected yi 0 but got [-3]
FAILED tests/v4/test_index_search.py::test_indexing_nemo_curvilinear - AssertionError: 
FAILED tests/v4/test_xgrid.py::test_xgrid_search_cpoints[2D lon/lat] - AssertionError: assert {'X': 0, 'Y': 0, 'Z': 0} == {'X': array([...': array([0])}

Note that in the first breaking unit test, a -3 means GRID_SEARCH_ERROR, so this is a clear case where the first computation of xsi and eta failed to produce results within the [0, 1] range
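
For context, the in-cell criterion that fails here is essentially a range check on the local cell coordinates. A minimal sketch (the names xsi, eta and the tolerance are illustrative; GRID_SEARCH_ERROR = -3 is taken from the assertion above):

```python
import numpy as np

GRID_SEARCH_ERROR = -3  # sentinel returned when no containing cell is found

def in_unit_square(xsi, eta, tol=1e-10):
    """True where the local cell coordinates (xsi, eta) lie inside [0, 1]."""
    xsi, eta = np.asarray(xsi), np.asarray(eta)
    return (xsi >= -tol) & (xsi <= 1 + tol) & (eta >= -tol) & (eta <= 1 + tol)

# Without the while-loop there is no retry: if the first (xsi, eta) estimate falls
# outside [0, 1], the search reports GRID_SEARCH_ERROR instead of moving to a
# neighbouring cell and recomputing.
```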

@fluidnumerics-joe
Contributor

If you get rid of the while loop, how will we handle the case where particles come in with an existing ei guess ?

@erikvansebille
Member Author

If you get rid of the while loop, how will we handle the case where particles come in with an existing ei guess ?

Well, since the Morton encoding is so fast, we may not even need particle.ei. Note that so far in v4-dev, we don't use/update/propagate particle.ei anyway, so it wouldn't have an impact on the current runs. Indeed, if we find that there's a considerable speed-up to be gained from propagating the old location, we could start using particle.ei again.

But if this PR is successful, we could save some memory overhead by not using the element index?

@fluidnumerics-joe
Contributor

I guess it would be worth pulling runtimes for a few key curvilinear grid examples (e.g. MOi) from v4-dev and comparing.

In the meantime, I'm working on getting you up to speed on how the Morton hashing works. There's a new notebook on the unstructured-mesh-parcels-sandbox repository that I should be able to finish up in a few days.

The goal is to facilitate discussion on algorithm choices around two main features of the Morton hashing implementation (a minimal encoding sketch follows the list below):

  • Face binning function - how faces are aligned with one or more Morton codes
  • Point-in-cell check vs. centroid distance minimization
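
For reference while these choices are discussed, below is a minimal, vectorized sketch of 2D Morton (Z-order) encoding on an implied hash grid. The function names, the grid origin/spacing parameters, and the 16-bits-per-dimension choice are illustrative, not taken from the actual implementation.

```python
import numpy as np

def part1by1(x):
    """Spread the lower 16 bits of x so that a zero bit sits between each of them."""
    x = np.asarray(x, dtype=np.uint32) & 0x0000FFFF
    x = (x | (x << 8)) & 0x00FF00FF
    x = (x | (x << 4)) & 0x0F0F0F0F
    x = (x | (x << 2)) & 0x33333333
    x = (x | (x << 1)) & 0x55555555
    return x

def morton2d(i, j):
    """Interleave two 16-bit cell indices into a single Morton (Z-order) code."""
    return part1by1(i) | (part1by1(j) << 1)

def hash_codes(lon, lat, lon0, lat0, dlon, dlat, bits=16):
    """Quantize lon/lat onto a 2**bits x 2**bits implied hash grid and return Morton codes."""
    i = np.clip(np.floor((np.asarray(lon) - lon0) / dlon), 0, 2**bits - 1).astype(np.uint32)
    j = np.clip(np.floor((np.asarray(lat) - lat0) / dlat), 0, 2**bits - 1).astype(np.uint32)
    return morton2d(i, j)
```

Points (or face centers, or face bounding boxes) that map to the same Morton code land in the same hash-table bucket, which is what makes the candidate search cheap.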

…l check

The query method now requires a point-in-cell check method to be sent in as an argument.
@fluidnumerics-joe
Contributor

@erikvansebille - this last commit fixes the issue. To gain speed, I originally went with a quick-and-dirty distance minimization for the point-in-cell check. Additionally, this only required registering faces to the hash table based on Morton encoding of the source grid face centers. While this is fast and puts us "in the neighborhood", as you've found, it still requires some additional searching.

Here, I've switched (back) to registering faces to the hash table by calculating all the Morton codes associated with a face's bounding box. This slightly increases the storage requirement for the hash table, but still seems to be OK; I need to see how this does on the MOi grid.

Additionally, the query method now requires a point_in_cell method that is vectorized and can return a boolean array indicating whether or not each point is in the cell (face). Here, I've just copied the bits from the curvilinear search to do this.
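
To make the two changes concrete, here is a rough sketch that reuses the morton2d/hash_codes helpers from the earlier sketch. All names, and the assumption that point_in_cell accepts an array of candidate face indices, are illustrative rather than the actual SpatialHash API.

```python
import numpy as np
from collections import defaultdict

def register_faces_by_bbox(face_lon, face_lat, lon0, lat0, dlon, dlat, bits=16):
    """Register each face under every Morton code touched by its bounding box.

    face_lon, face_lat: (n_faces, n_corners) arrays of cell-corner coordinates.
    """
    table = defaultdict(list)
    to_idx = lambda x, x0, dx: np.clip(np.floor((x - x0) / dx), 0, 2**bits - 1).astype(np.int64)
    i_lo, i_hi = to_idx(face_lon.min(axis=1), lon0, dlon), to_idx(face_lon.max(axis=1), lon0, dlon)
    j_lo, j_hi = to_idx(face_lat.min(axis=1), lat0, dlat), to_idx(face_lat.max(axis=1), lat0, dlat)
    for face in range(face_lon.shape[0]):
        for i in range(i_lo[face], i_hi[face] + 1):
            for j in range(j_lo[face], j_hi[face] + 1):
                table[int(morton2d(i, j))].append(face)
    return table

def query(table, lon, lat, point_in_cell, lon0, lat0, dlon, dlat, bits=16):
    """Return the face index containing each point, or -1 if no candidate passes point_in_cell."""
    codes = hash_codes(lon, lat, lon0, lat0, dlon, dlat, bits)
    found = np.full(np.asarray(lon).shape, -1, dtype=np.int64)
    for p, code in enumerate(codes):
        candidates = np.asarray(table.get(int(code), []), dtype=np.int64)
        if candidates.size:
            inside = point_in_cell(lon[p], lat[p], candidates)  # vectorized boolean array over candidates
            if inside.any():
                found[p] = candidates[np.argmax(inside)]
    return found
```

Construction cost scales with the total number of (Morton code, face) pairs, which is why hash cells that overlap thousands of elements (mentioned further down in this thread) slow down initialization; faces that straddle the antimeridian also need the special treatment handled by one of the commits in this PR.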

What needs to happen next (assuming we're happy with this):

  • It would be nice if the point-in-cell check could also return some metadata, like the computational coordinates that it found for each point-in-cell check. From there, I could then simply return the computational coordinates from SpatialHash.Query. Right now, we compute coordinates twice.
  • If we're good with the above point, all we really need for each grid type is a point_in_cell check method as described above. Thinking about adapting this to unstructured grids, this ends up making Morton hashing for unstructured grids #2160 a bit easier to do.

Let me know if you'd want to hop on a call Monday afternoon to talk this over a bit. Could be quicker in getting this resolved.

This ensures overlaps are properly calculated across the antimeridian
@fluidnumerics-joe
Contributor

We're back to having long construction times for the MOi grid. I'm working on incorporating land masks again

@erikvansebille
Member Author

Let me know if you'd want to hop on a call Monday afternoon to talk this over a bit. Could be quicker in getting this resolved.

My calendar is fully blocked on Monday and Tuesday; shall we discuss this on our regular Wednesday call?

Not sure I'm a fan of landmasks though; they put a lot of extra burden on users (who may not even have a landmask for their flow field). If we can't get the cell search faster, then let's just leave the while-loop in?

@fluidnumerics-joe
Contributor

The search is plenty fast. It's the construction (of specifically MOi) that is painfully slow

@fluidnumerics-joe
Contributor

I'm planning on seeing if increasing bit width helps improve initialization time.

This should effectively increase the resolution of the implied hash grid and reduce the number of hits per hash cell during construction. As before, some hash cells overlap thousands of elements, which is what is causing the slowdown.
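
As a rough illustration of the trade-off (illustrative numbers, not the MOi grid):

```python
# More bits per dimension -> finer implied hash grid -> fewer faces registered per hash cell,
# at the cost of more hash cells touched by each face's bounding box.
domain_deg = 360.0  # zonal extent of a global grid
for bits in (8, 12, 16):
    n_per_dim = 2**bits
    print(f"{bits:2d} bits/dim: {n_per_dim:6d} cells/dim, ~{domain_deg / n_per_dim:.4f} deg per hash cell")
```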

@fluidnumerics-joe
Contributor

fluidnumerics-joe commented Sep 9, 2025

@erikvansebille - I've done some work to vectorize the hash table creation, which significantly improves performance. On the MOi grid, it takes 8-9 s (on my laptop) to initialize the hash table, and less than a second for a 1 million particle search.

Right now, the computational coordinates are computed twice. We'll need to look into a strategy to only compute them once during the point-in-cell check and filter out the correct ones within query. I'm keen to see what your experience is with this new version on the MOi grid.
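
For illustration, one way such a vectorized construction can look is to group the (Morton code, face) pairs into a CSR-style table with NumPy instead of appending to Python lists; this is a sketch under assumed names, not necessarily what the PR does.

```python
import numpy as np

def build_table(codes, faces):
    """Group face indices by Morton code.

    codes, faces: equal-length 1-D arrays of (code, face) pairs gathered during registration.
    Returns (unique_codes, offsets, sorted_faces) such that the faces registered under
    unique_codes[k] are sorted_faces[offsets[k]:offsets[k + 1]].
    """
    order = np.argsort(codes, kind="stable")
    sorted_codes, sorted_faces = codes[order], faces[order]
    unique_codes, starts = np.unique(sorted_codes, return_index=True)
    offsets = np.append(starts, sorted_codes.size)
    return unique_codes, offsets, sorted_faces

def candidates_for(unique_codes, offsets, sorted_faces, query_code):
    """Candidate faces registered under query_code (empty array if the code is unused)."""
    k = np.searchsorted(unique_codes, query_code)
    if k < unique_codes.size and unique_codes[k] == query_code:
        return sorted_faces[offsets[k]:offsets[k + 1]]
    return sorted_faces[:0]
```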

@VeckoTheGecko - I'd be interested to hear your thoughts on code organization here.

Let me know what you think.

@erikvansebille
Member Author

Great work @fluidnumerics-joe! I just added performance benchmarking at Parcels-code/parcels-benchmarks#5 (comment). We get a factor 10 speedup for 5 million particles

The question is whether we consider this fast enough to drop particle.ei, or whether we want to keep the machinery to first try the old position of the particle. Perhaps a nice clean solution would be to first see if particle.ei gives a valid hit (i.e. the particle is still in the same grid cell as the last search) and if not do a Morton query?
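
A hypothetical sketch of that fallback (particle.ei is mentioned above, but the query and point_in_cell signatures here are assumptions for illustration):

```python
def locate(particle, spatial_hash, point_in_cell):
    """Reuse the previous element index if the particle is still inside it; otherwise re-search."""
    ei = particle.ei
    if ei >= 0 and point_in_cell(particle.lon, particle.lat, ei):
        return ei                                        # still in the same grid cell as last time
    ei = spatial_hash.query(particle.lon, particle.lat)  # fall back to the Morton-hash search
    particle.ei = ei
    return ei
```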

@fluidnumerics-joe
Contributor

Great work @fluidnumerics-joe! I just added performance benchmarking at OceanParcels/parcels-benchmarks#5 (comment). We get a factor 10 speedup for 5 million particles

Too cool! It was looking good on my side, but I didn't want to claim victory until you had numbers on your system too. This is fantastic

The question is whether we consider this fast enough to drop particle.ei, or whether we want to keep the machinery to first try the old position of the particle. Perhaps a nice clean solution would be to first see if particle.ei gives a valid hit (i.e. the particle is still in the same grid cell as the last search) and if not do a Morton query?

I think that's a good idea. I'd like to have the particle in cell check return computational coordinates anyways (to cut the number of coordinate evaluations in half) and this does seem to fit nicely.

This function is now only needed within the uxgrid module
@fluidnumerics-joe
Contributor

@erikvansebille - Did a little digging here. That peak memory consumption occurs during the hash table construction, and very little is needed for querying (see, e.g., the numbers below for 1000 particles). In this example, the construction is called separately:

Memory usage for construction: current=1167 MB, peak=10454 MB
Memory usage for query: current=0 MB, peak=3 MB

However, I'm not able to reproduce the memory behavior with increasing particle count that you're seeing here. On my system,

Memory usage: current=1167 MB, peak=10454 MB
Memory usage: current=0 MB, peak=0 MB
Memory usage: current=0 MB, peak=0 MB
Memory usage: current=0 MB, peak=3 MB
Memory usage: current=0 MB, peak=13 MB
Memory usage: current=0 MB, peak=26 MB
Memory usage: current=0 MB, peak=125 MB
Memory usage: current=0 MB, peak=251 MB
Memory usage: current=0 MB, peak=1242 MB
Memory usage: current=0 MB, peak=2486 MB
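
For reproducibility, one way to produce current/peak figures of this form is Python's built-in tracemalloc (a sketch; the actual benchmark script may differ):

```python
import tracemalloc

tracemalloc.start()
# ... construct the spatial hash for the grid here ...
current, peak = tracemalloc.get_traced_memory()
print(f"Memory usage for construction: current={current // 2**20} MB, peak={peak // 2**20} MB")

tracemalloc.reset_peak()  # measure the query phase separately from construction
# ... run the particle query here ...
current, peak = tracemalloc.get_traced_memory()
print(f"Memory usage for query: current={current // 2**20} MB, peak={peak // 2**20} MB")
```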

We're not anticipating having beyond 2^31 faces in problems for the foreseeable future.
@fluidnumerics-joe
Contributor

The last commit uses int32 for all integers in hash table construction, which drops peak memory usage during initial construction from 10454 MB to 8731 MB. The main limitation is that a mesh can't have more than 2^31 (~2 billion) faces. Since most global meshes in the mesoscale-resolving range have O(1-10 million) faces, I think we'll be fine.

Memory usage: current=1167 MB, peak=8731 MB
Memory usage: current=0 MB, peak=0 MB
Memory usage: current=0 MB, peak=0 MB
Memory usage: current=0 MB, peak=3 MB
Memory usage: current=0 MB, peak=12 MB
Memory usage: current=0 MB, peak=24 MB
Memory usage: current=0 MB, peak=118 MB
Memory usage: current=0 MB, peak=237 MB
Memory usage: current=0 MB, peak=1176 MB
Memory usage: current=0 MB, peak=2354 MB
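
A quick illustration of where the int32 saving comes from (the array length is illustrative):

```python
import numpy as np

n_entries = 10_000_000                      # illustrative number of (code, face) pairs
faces64 = np.arange(n_entries, dtype=np.int64)
faces32 = faces64.astype(np.int32)          # safe as long as the face count stays below 2**31
print(faces64.nbytes // 2**20, "MB ->", faces32.nbytes // 2**20, "MB")  # 76 MB -> 38 MB
```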

@github-project-automation bot moved this from Backlog to Ready in Parcels development Sep 10, 2025
This eliminates the extra step needed to mask out grid search errors
This removes the need for passing the PIC method during the query call.
At the moment we're assuming that if an XGrid type comes in for the source_grid, it is indeed curvilinear.
@fluidnumerics-joe marked this pull request as ready for review September 10, 2025 15:20
@fluidnumerics-joe
Contributor

@erikvansebille - I'm curious to see how this latest version works with your benchmarks

@erikvansebille
Member Author

@erikvansebille - I'm curious to see how this latest version works with your benchmarks

Just checked; same speed and memory footprint as yesterday. Is that to be expected?

Contributor

@VeckoTheGecko left a comment


No further comments from my end - though my review here has been at a more surface level. If there's anything specific you'd like me to look at, I can shift some bandwidth to this PR

@fluidnumerics-joe
Contributor

Just checked; same speed and memory footprint as yesterday. Is that to be expected?

I think 0be54db is the last major change for memory consumption. Between that and the most recent version, we've cut out the additional computational coordinate calculation.

Getting the same speed tells me that the additional coordinate calculation was not a dominant contributor to runtime. I can't say if this was expected or not without a hotspot profile.

Final remark on memory consumption, though: in testing I did find that the peak memory consumption occurs during hash table initialization (hardly any is used in the query). So, if we were to look at reducing memory consumption in the future, effort ought to be focused on the initialization method of the SpatialHash class.

@VeckoTheGecko
Contributor

VeckoTheGecko commented Sep 12, 2025

Hi @fluidnumerics-joe, is there something specific that you would like me to review here? If so, I can get onto that next week

I'm currently focusing on FieldSet ingestion, which I think needs to remain my priority, so I don't know if I can commit enough bandwidth here to do a detailed review.

Have a great weekend :))

@fluidnumerics-joe
Contributor

@VeckoTheGecko - I've made all the requested changes. I'm really looking for approval on this at this point. I'm not too keen on starting the integration with UXGrid unless this is settled and merged into v4-dev.

Contributor

@VeckoTheGecko left a comment


yes, I should have clicked "approve" and not "comment" in the prior review :)

@fluidnumerics-joe merged commit bacd187 into v4-dev Sep 13, 2025
9 checks passed
@fluidnumerics-joe deleted the curvilinear_index_search_without_while_loop branch September 13, 2025 02:44
@github-project-automation bot moved this from Ready to Done in Parcels development Sep 13, 2025

Successfully merging this pull request may close these issues.

Morton hashing still requires while loop in some cases in Curvilinear Grid
