Description
This will require multiple steps to implement. I think the sequence of tasks will be:
Remove cell list data structures.
Calculate cell properties directly from particles not cell list.
Realign MPCD domain decomposition to snap with edges of cells.
Add forward and backward communication of particles to `CollisionMethod`.
Set a configurable buffer width for communicating MPCD particles, rather than use coverage box.
Sort particles using Hilbert or Z-order curve rather than cell list. Make a standard updater.
Consider removing / refactoring cell list to live in collision method.
Benchmark with both single and double particle data, and consider using a mixed-precision data model for MPCD particles.
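The Z-order option in the sorting task can be illustrated with a minimal sketch. This is not HOOMD code: the function names (`part1by2`, `morton3`, `zorder_permutation`) are hypothetical, and it assumes non-negative cell indices below 1024 per dimension and a cubic cell of edge `cell_size`. Particles are keyed by the Morton code of their collision cell, so particles in nearby cells end up close together in memory without building a cell list.

```python
import math


def part1by2(n):
    """Spread the low 10 bits of n so two zero bits separate each original bit."""
    n &= 0x000003FF
    n = (n ^ (n << 16)) & 0xFF0000FF
    n = (n ^ (n << 8)) & 0x0300F00F
    n = (n ^ (n << 4)) & 0x030C30C3
    n = (n ^ (n << 2)) & 0x09249249
    return n


def morton3(ix, iy, iz):
    """Interleave three 10-bit cell indices into one 30-bit Z-order key."""
    return part1by2(ix) | (part1by2(iy) << 1) | (part1by2(iz) << 2)


def zorder_permutation(positions, cell_size, origin=(0.0, 0.0, 0.0)):
    """Return particle indices sorted by the Z-order key of their cell.

    Assumption: every position lies at or above `origin`, so all cell
    indices are non-negative.
    """
    keys = []
    for p in positions:
        ix, iy, iz = (
            int(math.floor((p[d] - origin[d]) / cell_size)) for d in range(3)
        )
        keys.append(morton3(ix, iy, iz))
    # sorted() is stable, so particles in the same cell keep their relative order.
    return sorted(range(len(positions)), key=keys.__getitem__)


positions = [(2.5, 0.5, 0.5), (0.5, 0.5, 0.5), (0.5, 1.5, 0.5), (1.5, 0.5, 0.5)]
order = zorder_permutation(positions, cell_size=1.0)
```

A Hilbert curve would preserve locality somewhat better at the cost of a more involved key computation; the permutation-based interface would stay the same.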
Motivation and context
The supplementary material of https://doi.org/10.1016/j.cpc.2024.109494 (a benchmark and issue similar to the one discussed in #1887) shows that HOOMD underperforms their implementation on both CPUs and GPUs. On the CPU, HOOMD has better strong scaling but lower performance for most core counts. On the GPU, there is also a performance gap, and HOOMD's strong scaling is now abysmal. Based on the implementation presented in Section 3, I think the most likely differences are:
Mixed-precision particle data (they use float positions & double velocities). I reproduced their benchmark numbers using a fully double-precision build of HOOMD. The impact of this could be assessed by rerunning with MPCD mixed precision.
Their cell property calculation is done using atomic operations, which bypasses the cell list. This should lead to fewer memory transactions, even if they are more scattered, and MPCD is typically memory bound.
They communicate MPCD particles to the rank owning the collision cell, then backward communicate the effect of the collision. This pattern may have less communication overhead than HOOMD's strategy of communicating overlapping cells because it is point-to-point. It also means that particles can drift farther before migration.
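The cell-list-free cell property calculation described above can be sketched as a scatter-add over particles. This is a serial illustration only, not HOOMD's implementation or the paper's: the function name `cell_properties`, the equal-mass assumption, the periodic wrapping, and the row-major cell indexing are all assumptions made for the example. On a GPU, each `+=` in the loop would correspond to an atomic add into the per-cell arrays.

```python
import math


def cell_properties(positions, velocities, mass, cell_size, grid_dim):
    """Accumulate per-cell particle count and center-of-mass velocity.

    Each particle computes its own cell index and adds directly into the
    per-cell accumulators, so no cell list is ever built.
    Assumptions: equal particle masses and a periodic grid with its
    origin at (0, 0, 0).
    """
    nx, ny, nz = grid_dim
    n_cells = nx * ny * nz
    count = [0] * n_cells
    momentum = [[0.0, 0.0, 0.0] for _ in range(n_cells)]

    for r, v in zip(positions, velocities):
        ix = int(math.floor(r[0] / cell_size)) % nx
        iy = int(math.floor(r[1] / cell_size)) % ny
        iz = int(math.floor(r[2] / cell_size)) % nz
        c = ix + nx * (iy + ny * iz)  # row-major cell index
        count[c] += 1  # would be an atomic add on the GPU
        for d in range(3):
            momentum[c][d] += mass * v[d]  # would be an atomic add on the GPU

    # Center-of-mass velocity; empty cells are left at zero.
    com_velocity = [
        [p / (mass * n) if n > 0 else 0.0 for p in m]
        for m, n in zip(momentum, count)
    ]
    return count, com_velocity


positions = [(0.2, 0.2, 0.2), (0.8, 0.1, 0.3), (1.5, 0.5, 0.5)]
velocities = [(1.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
count, com_velocity = cell_properties(
    positions, velocities, mass=1.0, cell_size=1.0, grid_dim=(2, 1, 1)
)
```

The writes are scattered across cells, but each particle is read once and nothing is written to an intermediate per-cell particle list, which is the source of the expected reduction in memory traffic.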
There are other benefits to refactoring as proposed above:
Lower memory demands, especially for large systems or those with boundaries.