
Conversation

@erikvansebille
Member

@erikvansebille commented Sep 4, 2025

This PR addresses #2175 by exploring whether it is possible to remove the while-loop in the _search_indices_curvilinear_2d function.

Currently, there are three breaking unit tests:

FAILED tests/v4/test_index_search.py::test_grid_indexing_fpoints - AssertionError: Expected yi 0 but got [-3]
FAILED tests/v4/test_index_search.py::test_indexing_nemo_curvilinear - AssertionError: 
FAILED tests/v4/test_xgrid.py::test_xgrid_search_cpoints[2D lon/lat] - AssertionError: assert {'X': 0, 'Y': 0, 'Z': 0} == {'X': array([...': array([0])}

Note that in the first breaking unit test, a -3 means GRID_SEARCH_ERROR, so this is a clear case where the first computation of xsi and eta failed to produce results within the [0, 1] range
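
For context, the in-cell criterion that fails here is essentially a range check on the local cell coordinates. A minimal sketch (the names xsi, eta and the tolerance are illustrative; GRID_SEARCH_ERROR = -3 is taken from the assertion above):

```python
import numpy as np

GRID_SEARCH_ERROR = -3  # sentinel returned when no containing cell is found

def in_unit_square(xsi, eta, tol=1e-10):
    """True where the local cell coordinates (xsi, eta) lie inside [0, 1]."""
    xsi, eta = np.asarray(xsi), np.asarray(eta)
    return (xsi >= -tol) & (xsi <= 1 + tol) & (eta >= -tol) & (eta <= 1 + tol)

# Without the while-loop there is no retry: if the first (xsi, eta) estimate falls
# outside [0, 1], the search reports GRID_SEARCH_ERROR instead of moving to a
# neighbouring cell and recomputing.
```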

@fluidnumerics-joe
Contributor

If you get rid of the while loop, how will we handle the case where particles come in with an existing ei guess ?

@erikvansebille
Member Author

If you get rid of the while loop, how will we handle the case where particles come in with an existing ei guess ?

Well, since the Morton encoding is so fast, we may not even need particle.ei. Note that so far in v4-dev, we don't use/update/propagate particle.ei anyway, so it wouldn't have an impact on the current runs. Indeed, if we find that there's a considerable speed-up to be gained from propagating the old location, we could start using particle.ei again.

But if this PR is successful, we could save some memory overhead by not using the element index?

@fluidnumerics-joe
Contributor

I guess it would be worth pulling runtimes for a few key curvilinear grid examples (e.g. MOi) from v4-dev and comparing.

In the meantime, I'm working on getting you up to speed on how the Morton hashing works. There's a new notebook on the unstructured-mesh-parcels-sandbox repository that I should be able to finish up in a few days.

The goal is to facilitate discussion on algorithm choices around two main features of the Morton hashing implementation (a minimal encoding sketch follows the list below):

  • Face binning function - how faces are aligned with one or more Morton codes
  • Point-in-cell check vs. centroid distance minimization
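
For reference while these choices are discussed, below is a minimal, vectorized sketch of 2D Morton (Z-order) encoding on an implied hash grid. The function names, the grid origin/spacing parameters, and the 16-bits-per-dimension choice are illustrative, not taken from the actual implementation.

```python
import numpy as np

def part1by1(x):
    """Spread the lower 16 bits of x so that a zero bit sits between each of them."""
    x = np.asarray(x, dtype=np.uint32) & 0x0000FFFF
    x = (x | (x << 8)) & 0x00FF00FF
    x = (x | (x << 4)) & 0x0F0F0F0F
    x = (x | (x << 2)) & 0x33333333
    x = (x | (x << 1)) & 0x55555555
    return x

def morton2d(i, j):
    """Interleave two 16-bit cell indices into a single Morton (Z-order) code."""
    return part1by1(i) | (part1by1(j) << 1)

def hash_codes(lon, lat, lon0, lat0, dlon, dlat, bits=16):
    """Quantize lon/lat onto a 2**bits x 2**bits implied hash grid and return Morton codes."""
    i = np.clip(np.floor((np.asarray(lon) - lon0) / dlon), 0, 2**bits - 1).astype(np.uint32)
    j = np.clip(np.floor((np.asarray(lat) - lat0) / dlat), 0, 2**bits - 1).astype(np.uint32)
    return morton2d(i, j)
```

Points (or face centers, or face bounding boxes) that map to the same Morton code land in the same hash-table bucket, which is what makes the candidate search cheap.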

…l check

The query method now requires a point-in-cell check method to be sent in as an argument.
@fluidnumerics-joe
Contributor

@erikvansebille - this last commit fixes the issue. To gain speed, I originally went with a quick-and-dirty distance minimization for the point-in-cell check. Additionally, this only required registering faces to the hash table based on Morton encoding of the source grid face centers. While this is fast and puts us "in the neighborhood", as you've found, it still requires some additional searching.

Here, I've switched (back) to registering faces to the hash table by calculating all the Morton codes associated with a face's bounding box. This slightly increases the storage requirement for the hash table, but still seems to be OK; I need to see how this does on the MOi grid.

Additionally, the query method now requires a point_in_cell method that is vectorized and can return a boolean array indicating whether or not each point is in the cell (face). Here, I've just copied the bits from the curvilinear search to do this.
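
To make the two changes concrete, here is a rough sketch that reuses the morton2d/hash_codes helpers from the earlier sketch. All names, and the assumption that point_in_cell accepts an array of candidate face indices, are illustrative rather than the actual SpatialHash API.

```python
import numpy as np
from collections import defaultdict

def register_faces_by_bbox(face_lon, face_lat, lon0, lat0, dlon, dlat, bits=16):
    """Register each face under every Morton code touched by its bounding box.

    face_lon, face_lat: (n_faces, n_corners) arrays of cell-corner coordinates.
    """
    table = defaultdict(list)
    to_idx = lambda x, x0, dx: np.clip(np.floor((x - x0) / dx), 0, 2**bits - 1).astype(np.int64)
    i_lo, i_hi = to_idx(face_lon.min(axis=1), lon0, dlon), to_idx(face_lon.max(axis=1), lon0, dlon)
    j_lo, j_hi = to_idx(face_lat.min(axis=1), lat0, dlat), to_idx(face_lat.max(axis=1), lat0, dlat)
    for face in range(face_lon.shape[0]):
        for i in range(i_lo[face], i_hi[face] + 1):
            for j in range(j_lo[face], j_hi[face] + 1):
                table[int(morton2d(i, j))].append(face)
    return table

def query(table, lon, lat, point_in_cell, lon0, lat0, dlon, dlat, bits=16):
    """Return the face index containing each point, or -1 if no candidate passes point_in_cell."""
    codes = hash_codes(lon, lat, lon0, lat0, dlon, dlat, bits)
    found = np.full(np.asarray(lon).shape, -1, dtype=np.int64)
    for p, code in enumerate(codes):
        candidates = np.asarray(table.get(int(code), []), dtype=np.int64)
        if candidates.size:
            inside = point_in_cell(lon[p], lat[p], candidates)  # vectorized boolean array over candidates
            if inside.any():
                found[p] = candidates[np.argmax(inside)]
    return found
```

Construction cost scales with the total number of (Morton code, face) pairs, which is why hash cells that overlap thousands of elements (mentioned further down in this thread) slow down initialization; faces that straddle the antimeridian also need the special treatment handled by one of the commits in this PR.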

What needs to happen next (assuming we're happy with this):

  • It would be nice if the point-in-cell check could also return some metadata, like the computational coordinates that it found for each point-in-cell check. From there, I could then simply return the computational coordinates from SpatialHash.Query. Right now, we compute coordinates twice.
  • If we're good with the above point, all we really need for each grid type is a point_in_cell check method as described above. Thinking about adapting this to unstructured grids, this ends up making Morton hashing for unstructured grids #2160 a bit easier to do.

Let me know if you'd want to hop on a call Monday afternoon to talk this over a bit. Could be quicker in getting this resolved.

This ensures overlaps are properly calculated across the antimeridian
@fluidnumerics-joe
Contributor

We're back to having long construction times for the MOi grid. I'm working on incorporating land masks again

@erikvansebille
Member Author

Let me know if you'd want to hop on a call Monday afternoon to talk this over a bit. Could be quicker in getting this resolved.

My calendar is fully blocked on Monday and Tuesday; shall we discuss this on our regular Wednesday call?

Not sure I'm a fan of landmasks though; they put a lot of extra burden on users (who may not even have a landmask for their flow field). If we can't get the cell search faster, then let's just leave the while-loop in?

@fluidnumerics-joe
Contributor

The search is plenty fast. It's the construction (of specifically MOi) that is painfully slow

@fluidnumerics-joe
Contributor

I'm planning on seeing if increasing bit width helps improve initialization time.

This should effectively increase the resolution of the implied hash grid and reduce the number of hits per hash cell during construction. As before, some hash cells overlap thousands of elements, which is what is causing the slowdown.
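
As a rough illustration of the trade-off (illustrative numbers, not the MOi grid):

```python
# More bits per dimension -> finer implied hash grid -> fewer faces registered per hash cell,
# at the cost of more hash cells touched by each face's bounding box.
domain_deg = 360.0  # zonal extent of a global grid
for bits in (8, 12, 16):
    n_per_dim = 2**bits
    print(f"{bits:2d} bits/dim: {n_per_dim:6d} cells/dim, ~{domain_deg / n_per_dim:.4f} deg per hash cell")
```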

@fluidnumerics-joe
Contributor

fluidnumerics-joe commented Sep 9, 2025

@erikvansebille - I've done some work to vectorize the hash table creation, which significantly improves performance. On the MOi grid, it takes 8-9 s (on my laptop) to initialize the hash table, and less than a second for a 1 million particle search.

Right now, the computational coordinates are computed twice. We'll need to look into a strategy to only compute them once during the point-in-cell check and filter out the correct ones within query. I'm keen to see what your experience is with this new version on the MOi grid.
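
For illustration, one way such a vectorized construction can look is to group the (Morton code, face) pairs into a CSR-style table with NumPy instead of appending to Python lists; this is a sketch under assumed names, not necessarily what the PR does.

```python
import numpy as np

def build_table(codes, faces):
    """Group face indices by Morton code.

    codes, faces: equal-length 1-D arrays of (code, face) pairs gathered during registration.
    Returns (unique_codes, offsets, sorted_faces) such that the faces registered under
    unique_codes[k] are sorted_faces[offsets[k]:offsets[k + 1]].
    """
    order = np.argsort(codes, kind="stable")
    sorted_codes, sorted_faces = codes[order], faces[order]
    unique_codes, starts = np.unique(sorted_codes, return_index=True)
    offsets = np.append(starts, sorted_codes.size)
    return unique_codes, offsets, sorted_faces

def candidates_for(unique_codes, offsets, sorted_faces, query_code):
    """Candidate faces registered under query_code (empty array if the code is unused)."""
    k = np.searchsorted(unique_codes, query_code)
    if k < unique_codes.size and unique_codes[k] == query_code:
        return sorted_faces[offsets[k]:offsets[k + 1]]
    return sorted_faces[:0]
```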

@VeckoTheGecko - I'd be interested to hear your thoughts on code organization here.

Let me know what you think.

@erikvansebille
Member Author

Great work @fluidnumerics-joe! I just added performance benchmarking at Parcels-code/parcels-benchmarks#5 (comment). We get a factor 10 speedup for 5 million particles

The question is whether we consider this fast enough to drop particle.ei, or whether we want to keep the machinery to first try the old position of the particle. Perhaps a nice clean solution would be to first see if particle.ei gives a valid hit (i.e. the particle is still in the same grid cell as the last search) and if not do a Morton query?
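
A hypothetical sketch of that fallback (particle.ei is mentioned above, but the query and point_in_cell signatures here are assumptions for illustration):

```python
def locate(particle, spatial_hash, point_in_cell):
    """Reuse the previous element index if the particle is still inside it; otherwise re-search."""
    ei = particle.ei
    if ei >= 0 and point_in_cell(particle.lon, particle.lat, ei):
        return ei                                        # still in the same grid cell as last time
    ei = spatial_hash.query(particle.lon, particle.lat)  # fall back to the Morton-hash search
    particle.ei = ei
    return ei
```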

@fluidnumerics-joe
Contributor

Great work @fluidnumerics-joe! I just added performance benchmarking at OceanParcels/parcels-benchmarks#5 (comment). We get a factor 10 speedup for 5 million particles

Too cool! It was looking good on my side, but I didn't want to claim victory until you had numbers on your system too. This is fantastic

The question is whether we consider this fast enough to drop particle.ei, or whether we want to keep the machinery to first try the old position of the particle. Perhaps a nice clean solution would be to first see if particle.ei gives a valid hit (i.e. the particle is still in the same grid cell as the last search) and if not do a Morton query?

I think that's a good idea. I'd like to have the particle in cell check return computational coordinates anyways (to cut the number of coordinate evaluations in half) and this does seem to fit nicely.

This function is now only needed within the uxgrid module
@fluidnumerics-joe
Contributor

@erikvansebille - Did a little digging here. That peak memory consumption occurs during the hash table construction, and very little is needed for querying (see, e.g., the numbers below for 1000 particles). In this example, the construction is called separately:

Memory usage for construction: current=1167 MB, peak=10454 MB
Memory usage for query: current=0 MB, peak=3 MB

However, I'm not able to reproduce the memory behavior with increasing particle count that you're seeing here. On my system,

Memory usage: current=1167 MB, peak=10454 MB
Memory usage: current=0 MB, peak=0 MB
Memory usage: current=0 MB, peak=0 MB
Memory usage: current=0 MB, peak=3 MB
Memory usage: current=0 MB, peak=13 MB
Memory usage: current=0 MB, peak=26 MB
Memory usage: current=0 MB, peak=125 MB
Memory usage: current=0 MB, peak=251 MB
Memory usage: current=0 MB, peak=1242 MB
Memory usage: current=0 MB, peak=2486 MB
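
For reproducibility, one way to produce current/peak figures of this form is Python's built-in tracemalloc (a sketch; the actual benchmark script may differ):

```python
import tracemalloc

tracemalloc.start()
# ... construct the spatial hash for the grid here ...
current, peak = tracemalloc.get_traced_memory()
print(f"Memory usage for construction: current={current // 2**20} MB, peak={peak // 2**20} MB")

tracemalloc.reset_peak()  # measure the query phase separately from construction
# ... run the particle query here ...
current, peak = tracemalloc.get_traced_memory()
print(f"Memory usage for query: current={current // 2**20} MB, peak={peak // 2**20} MB")
```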

We're not anticipating having beyond 2^31 faces in problems for the foreseeable future.
@fluidnumerics-joe
Contributor

The last commit uses int32 for all integers in hash table construction, which drops peak memory usage during initial construction from 10454 MB to 8731 MB. The main limitation is that a mesh can't have more than 2^31 (~2 billion) faces. Since most global meshes in the mesoscale-resolving range have O(1-10 million) faces, I think we'll be fine.

Memory usage: current=1167 MB, peak=8731 MB
Memory usage: current=0 MB, peak=0 MB
Memory usage: current=0 MB, peak=0 MB
Memory usage: current=0 MB, peak=3 MB
Memory usage: current=0 MB, peak=12 MB
Memory usage: current=0 MB, peak=24 MB
Memory usage: current=0 MB, peak=118 MB
Memory usage: current=0 MB, peak=237 MB
Memory usage: current=0 MB, peak=1176 MB
Memory usage: current=0 MB, peak=2354 MB
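
A quick illustration of where the int32 saving comes from (the array length is illustrative):

```python
import numpy as np

n_entries = 10_000_000                      # illustrative number of (code, face) pairs
faces64 = np.arange(n_entries, dtype=np.int64)
faces32 = faces64.astype(np.int32)          # safe as long as the face count stays below 2**31
print(faces64.nbytes // 2**20, "MB ->", faces32.nbytes // 2**20, "MB")  # 76 MB -> 38 MB
```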

@github-project-automation bot moved this from Backlog to Ready in Parcels development Sep 10, 2025
This eliminates the extra step needed to mask out grid search errors
This removes the need for passing the PIC method during the query call.
At the moment we're assuming that if an XGrid type comes in for the source_grid, it is indeed curvilinear.
@fluidnumerics-joe marked this pull request as ready for review September 10, 2025 15:20
@fluidnumerics-joe
Contributor

@erikvansebille - I'm curious to see how this latest version works with your benchmarks

@erikvansebille
Member Author

@erikvansebille - I'm curious to see how this latest version works with your benchmarks

Just checked; same speed and memory footprint as yesterday. Is that to be expected?

Contributor

@VeckoTheGecko left a comment


No further comments from my end - though my review here has been at a more surface level. If there's anything specific you'd like me to look at, I can shift some bandwidth to this PR

@fluidnumerics-joe
Contributor

Just checked; same speed and memory footprint as yesterday. Is that to be expected?

I think 0be54db is the last major change for memory consumption. Between that and the most recent version, we've cut out the additional computational coordinate calculation.

Getting the same speed tells me that the additional coordinate calculation was not a dominant contributor to runtime. I can't say if this was expected or not without a hotspot profile.

Final remark on memory consumption, though: in testing I did find that the peak memory consumption occurs during hash table initialization (hardly any is used in the query). So, if we were to look at reducing memory consumption in the future, effort ought to be focused on the initialization method of the SpatialHash class.

@VeckoTheGecko
Contributor

VeckoTheGecko commented Sep 12, 2025

Hi @fluidnumerics-joe, is there something specific that you would like me to review here? If so, I can get onto that next week

I'm currently focusing on FieldSet ingestion, which I think needs to remain my priority, so I don't know if I can commit enough bandwidth here to do a detailed review.

Have a great weekend :))

@fluidnumerics-joe
Contributor

@VeckoTheGecko - I've made all the requested changes. I'm really looking for approval on this at this point. I'm not too keen on starting the integration with UXGrid unless this is settled and merged into v4-dev.

Contributor

@VeckoTheGecko left a comment


yes, I should have clicked "approve" and not "comment" in the prior review :)

@fluidnumerics-joe merged commit bacd187 into v4-dev Sep 13, 2025
9 checks passed
@fluidnumerics-joe deleted the curvilinear_index_search_without_while_loop branch September 13, 2025 02:44
@github-project-automation bot moved this from Ready to Done in Parcels development Sep 13, 2025

Successfully merging this pull request may close these issues.

Morton hashing still requires while loop in some cases in Curvilinear Grid
