
Fix race condition in coll_han_alltoall.c #13115

Merged
1 commit merged on Mar 6, 2025

Conversation

Petter-Programs

Add a barrier in the cases where memory is read directly from neighbors, to fix an issue in coll_han_alltoall.c where the send buffer can be overwritten prematurely. Please see the reproducer below for more detail on the issue. The problem is that the alltoall returns before it is safe for the caller to touch the send buffer.

Bug discovered and patched at AccelCom, Barcelona Supercomputing Center.

Here is the reproducer:

#include <mpi.h>
#include <assert.h>
#include <stdio.h>

#define MPI_PROCESSES 4

#define MAX_ITERATIONS 100000

#define SPOOKY_NUMBER 9999

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    int real_world_size, my_rank;

    MPI_Comm_size(MPI_COMM_WORLD, &real_world_size);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    if(real_world_size != MPI_PROCESSES)
    {
        if(my_rank == 0)
            printf("Expected exactly %d processes.\n", MPI_PROCESSES);

        MPI_Finalize();
        return 1;
    }

    int send_buff[MPI_PROCESSES];
    int recv_buff[MPI_PROCESSES];

    // Fill send_buff with ones, recv_buff with zeroes
    for(int i = 0; i<MPI_PROCESSES; i++)
    {
        send_buff[i] = 1;
        recv_buff[i] = 0;
    }

    for(int it = 0; it<MAX_ITERATIONS; it++)
    {
        MPI_Alltoall(send_buff, 1, MPI_INT, recv_buff, 1, MPI_INT, MPI_COMM_WORLD);

        // Do something else with the send buffer
        for(int i = 0; i< MPI_PROCESSES; i++)
        {
            send_buff[i] = SPOOKY_NUMBER;
        }
        
        // This should not have affected the receive buffer... right?
        for(int i = 0; i< MPI_PROCESSES; i++)
        {
            assert(recv_buff[i] != SPOOKY_NUMBER);
        }

        // Reset the send buffer back for the next iteration
        for(int i = 0; i< MPI_PROCESSES; i++)
        {
            send_buff[i] = 1;
        }

    }

    printf("Rank %d: successfully completed %d iterations!\n", my_rank, MAX_ITERATIONS);

    MPI_Finalize();

    return 0;
}

To run, use this:

mpirun -np 4 --host host_one:2,host_two:2 ./program

In my case the ssh launcher was used, and Open MPI automatically selected the coll_han_alltoall implementation.
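To illustrate the fix pattern: the sketch below is not the coll_han_alltoall.c code, just a minimal, hypothetical example built on an MPI-3 shared-memory window (names such as direct_read_alltoall and node_comm are made up). It shows why an exchange that reads memory directly from neighbors must not return until every peer has finished reading:

#include <mpi.h>
#include <stdio.h>
#include <string.h>

/* Each rank publishes its send data in a node-local shared window; peers
 * read their slice directly out of it (one int per peer, for brevity). */
static void direct_read_alltoall(const int *sbuf, int *rbuf, MPI_Comm node_comm)
{
    int rank, size;
    MPI_Comm_rank(node_comm, &rank);
    MPI_Comm_size(node_comm, &size);

    int *exposed;
    MPI_Win win;
    MPI_Win_allocate_shared((MPI_Aint)(size * sizeof(int)), sizeof(int),
                            MPI_INFO_NULL, node_comm, &exposed, &win);
    MPI_Win_lock_all(MPI_MODE_NOCHECK, win);   // open a load/store epoch

    memcpy(exposed, sbuf, size * sizeof(int)); // publish my send data
    MPI_Win_sync(win);
    MPI_Barrier(node_comm);                    // everyone has published
    MPI_Win_sync(win);

    // Read my slice directly out of each peer's exposed buffer
    for (int peer = 0; peer < size; peer++) {
        MPI_Aint seg_size;
        int disp_unit;
        int *peer_buf;
        MPI_Win_shared_query(win, peer, &seg_size, &disp_unit, &peer_buf);
        rbuf[peer] = peer_buf[rank];
    }

    /* Trailing barrier -- the essence of the fix: without it a fast rank
     * could return (and let its caller overwrite the exchanged memory)
     * while a slower peer is still reading it. */
    MPI_Barrier(node_comm);

    MPI_Win_unlock_all(win);
    MPI_Win_free(&win);
}

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    // Direct loads from a neighbor's memory only work within a node
    MPI_Comm node_comm;
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                        MPI_INFO_NULL, &node_comm);

    int size, rank;
    MPI_Comm_size(node_comm, &size);
    MPI_Comm_rank(node_comm, &rank);

    int sbuf[64], rbuf[64];                    // assumes <= 64 ranks per node
    for (int i = 0; i < size; i++)
        sbuf[i] = rank;

    direct_read_alltoall(sbuf, rbuf, node_comm);
    printf("Rank %d: rbuf[0] = %d\n", rank, rbuf[0]);

    MPI_Comm_free(&node_comm);
    MPI_Finalize();
    return 0;
}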

Add a barrier in the cases where memory is read directly from neighbors,
to fix an issue where the send buffer is overwritten prematurely in
coll_han_alltoall.c

Bug discovered and patched at AccelCom, Barcelona Supercomputing Center

Signed-off-by: Petter Sandas <[email protected]>
lrbison (Contributor) commented Mar 3, 2025

Thank you for finding this issue and submitting the fix. This looks like the right fix to me.

The problem does not exist for han alltoallv, because we have a trailing barrier there: https://github.com/open-mpi/ompi/blob/main/ompi/mca/coll/han/coll_han_alltoallv.c#L956

lrbison self-assigned this Mar 3, 2025
lrbison (Contributor) commented Mar 6, 2025

Tested and confirmed both the issue and the fix. Thanks again for finding this!

lrbison merged commit 3f12346 into open-mpi:main Mar 6, 2025
15 checks passed
lrbison (Contributor) commented Mar 6, 2025

No need for a backport, as the algorithm isn't in the v5.0.x branch.
