MPCD in tilted boxes fails for some domain decompositions #2009

Open
mphoward opened this issue Feb 28, 2025 · 0 comments
Description

@wkdarko reported that his MPCD simulation failed in a skewed (tilted) box for certain domain decompositions. He provided the script below, and I reproduced the issue on as few as 4 processors with a 2 × 2 decomposition in x and y. Not all decompositions trigger the error.

I suspect there is a precision/rounding issue in how the overlapping cells are set up for the domain decomposition and how particles are binned into them. When I generalized the code to non-cubic cells, I found the implementation was very sensitive to this kind of rounding. I thought I had devised a scheme that would avoid these issues, but apparently not.
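As a toy illustration of the kind of inconsistency I mean (this is not HOOMD's actual cell list code; the box size, cell count, and both binning formulas here are made up), the sketch below bins particles sitting a few ulps from a cell edge in two mathematically equivalent ways and counts how often the answers disagree:

import numpy as np

rng = np.random.default_rng(7)
L = 20.0
num_cells = 13                      # L / 13 is not exactly representable
cell_size = L / num_cells
lo = -0.5 * L
edges = lo + cell_size * np.arange(num_cells + 1)   # one way to place cell edges

mismatches = 0
samples = 100000
for _ in range(samples):
    i = rng.integers(1, num_cells)                  # pick an interior cell edge
    x = edges[i] + rng.normal(scale=1e-15)          # particle a few ulps from it
    cell_by_division = int(np.floor((x - lo) / cell_size))
    cell_by_comparison = int(np.searchsorted(edges, x, side="right")) - 1
    mismatches += cell_by_division != cell_by_comparison
print(f"{mismatches} / {samples} near-edge particles binned inconsistently")

In exact arithmetic the two indices always agree; in double precision they can differ by one right at an edge, which is exactly the situation the overlapping-cell setup and the particle binning have to get right consistently on every rank.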

To fix this, I think the MPCD domain decomposition strategy should be reworked to follow the one in this paper: https://doi.org/10.1016/j.cpc.2024.109494. Particles would be communicated to the rank that owns the cell they are binned into for the entire collision step, so binning and ownership can never disagree. The paper claims this strategy is actually more efficient than the overlapping-cell scheme we currently use because it is a point-to-point communication pattern. However, it will require substantial effort to implement because the way cell properties are calculated and collisions are applied also needs to change. In the end, I think that effort is worthwhile because it may also help close some of the performance gap noted in that paper.
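To make the proposed strategy a little more concrete, here is a schematic sketch (again, not HOOMD code; the row-major rank mapping and the helper names are my own simplifications) of how each particle's destination rank would be determined from the cell it bins into, so that the owning rank both bins the particle and applies the collision:

import numpy as np

def owning_rank(cell, cells_per_rank, ranks_per_dim):
    # Map a global cell index (i, j, k) to the rank that owns that cell.
    # Row-major rank ordering is assumed for simplicity; the real
    # decomposition may order ranks differently.
    coords = [min(c // n, r - 1)
              for c, n, r in zip(cell, cells_per_rank, ranks_per_dim)]
    i, j, k = coords
    _, ry, rz = ranks_per_dim
    return (i * ry + j) * rz + k

def destination_ranks(fractions, num_cells, cells_per_rank, ranks_per_dim):
    # Bin each particle by its fractional coordinate in the (possibly tilted)
    # box, then return the rank that owns that cell. Every particle would be
    # sent there for the entire collision step.
    cells = np.floor(fractions * np.asarray(num_cells)).astype(int) % num_cells
    return np.array([owning_rank(c, cells_per_rank, ranks_per_dim) for c in cells])

Cell averages and the SRD rotation would then be computed entirely on the owning rank, and the reduction over overlapping boundary cells goes away; only the particle migration itself needs communication.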

I will prioritize working on this, and I will likely open a separate issue to flesh out the scope of work and track progress.

Script

import freud
import hoomd
import numpy as np

mpcd_density = 5        # MPCD particles per unit volume
L = 20.0                # box edge length
tilt = [1.0, 0.0, 0.0]  # xy, xz, yz tilt factors (sheared box)

device = hoomd.device.CPU()
sim = hoomd.Simulation(device=device, seed=342)

snapshot = hoomd.Snapshot()
if device.communicator.rank == 0:
    snapshot.configuration.box = [L, L, L] + tilt
    box = freud.Box.from_box(snapshot.configuration.box)

    snapshot.mpcd.N = np.round(mpcd_density * box.volume).astype(int)
    snapshot.mpcd.types = ['A']
    snapshot.mpcd.typeid[:] = np.zeros(snapshot.mpcd.N, dtype=int)

    rng = np.random.default_rng(seed=342)
    positions = rng.uniform(low=-0.5 * L, high=0.5 * L, size=(snapshot.mpcd.N, 3))
    positions = box.wrap(positions)
    snapshot.mpcd.position[:] = positions

    vel = rng.normal(loc=0.0, scale=np.sqrt(1.0), size=(snapshot.mpcd.N, 3))
    vel -= np.mean(vel, axis=0)
    snapshot.mpcd.velocity[:] = vel
sim.create_state_from_snapshot(snapshot, domain_decomposition=[2, 2, 1])

integrator = hoomd.mpcd.Integrator(dt=0.1)
integrator.collision_method = hoomd.mpcd.collide.StochasticRotationDynamics(
    period=1, angle=130, kT=1.0)
integrator.mpcd_particle_sorter = hoomd.mpcd.tune.ParticleSorter(
    trigger=integrator.collision_method.period * 20)
integrator.streaming_method = hoomd.mpcd.stream.Bulk(
    period=integrator.collision_method.period)
sim.operations.integrator = integrator

sim.run(10)

Input files

mpirun -n 4 python debug.py

Output

(hoomd-dev) $ bash debug.sh
notice(2): Using domain decomposition: n_x = 2 n_y = 2 n_z = 1.
**ERROR**:  (Rank 3): MPCD particle is no longer in the simulation box
Cartesian coordinates: 
x: -0.481269 y: 9.48594 z: -2.29128
Grid shift: 
x: 0.0157772 y: -0.0215342 z: -0.0098835
Traceback (most recent call last):
  File "/home/mphoward/Documents/projects/mpcd_noncubic_cells/setup/debug/debug.py", line 42, in <module>
    sim.run(10)
  File "/home/mphoward/Documents/code/glotzerlab/build/hoomd/hoomd/simulation.py", line 562, in run
    self._cpp_sys.run(steps_int, write_at_start)
RuntimeError: Error computing cell list
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------

Expected output

No error should be generated.

Platform

Linux, CPU

Installation method

Compiled from source

HOOMD-blue version

5.1.0

Python version

3.12.3

@mphoward mphoward added the bug Something isn't working label Feb 28, 2025
@mphoward mphoward self-assigned this Feb 28, 2025
@mphoward mphoward added the mpcd MPCD component label Feb 28, 2025