Locality-Aware Neighbor Updates #63

Open
bienz2 wants to merge 30 commits into mpi-advance:develop from bienz2:mpi-advance-develop

Collaborator

bienz2 commented Mar 5, 2026

Large refactor of locality-aware neighbor collectives:

  • Changed the MPIL_Request object to hold pointers to the MPIL_Requests for local_L, local_S, and local_R comm. This simplified the neighbor_start/neighbor_wait routines.
  • Because MPIL_Request objects are now used, I replaced the calls to init_communication with neighbor_alltoallv_init_standard and deleted the file containing the init_communication routines.
  • I replaced dynamic communication throughout neighbor_locality.cpp with calls to MPIL_Alltoall_crs and alltoallv_crs_personalized, combining calls where possible to reduce the communication overhead of setting up locality neighborhood collectives.
  • I cleaned up the code, removing all locality methods from the communication directory. The CommData structs are now defined directly in neighbor_locality.cpp, since they are not used elsewhere.

bienz2 and others added 30 commits February 4, 2026 14:22

  • Fix syntax for date command in traffic workflow
  • Added logic to fetch and output daily clone traffic data.
  • Run at midnight is breaking; run at 6am each day instead.

Collaborator Author

bienz2 commented Mar 5, 2026

This also includes a bug fix. The SuiteSparse neighbor alltoallv init test was missing:
MPIL_Start(xrequest);
MPIL_Wait(xrequest, &status);
MPIL_Request_free(&xrequest);

Without these calls, the test created a persistent request but never freed it, which caused an error. The tests were still passing, but running the test individually produced an error message from MPICH.

Collaborator Author

bienz2 commented Mar 5, 2026

Actually, there are two bug fixes here. The second was a missing MPI_Waitall in one implementation of alltoall_crs.

Contributor

TheMasterDirk left a comment

I leave the conceptual validation of the code to you, as I just skimmed that part (and it seemed ok to me).

Most of my requested changes are just consistency fixes and/or keeping things simpler. Let me know if there are any issues, and feel free to reject any of my suggestions (just let me know if you do).

int main(int argc, char* argv[])
{
MPI_Init(&argc, &argv);

Contributor

Should we add an MPIL_Init here as well? I know this test may not use what it sets up, but if we ever add more to the init function, it might be a good idea.

int* counts;
int* indices;
char* buffer;
} CommData;
Contributor

Since the struct is still around, would you mind this setup:

  • Keep comm_data.h and comm_data.cpp, but:
  • Move the header to "library/include/neighborhood"
  • Move the source file to "library/source/neighborhood" (it can sit directly in that folder, not in any subfolder), but still add it to the CMakeLists in this folder
  • Delete the struct and the CommData-specific methods here.

That way we can keep the struct in its own header (so it's easier to find where it is), and we get to keep the documentation we made?


void map_procs_to_nodes(LocalityComm* locality,
const int orig_num_msgs,
#ifdef __cplusplus
Contributor

The C++ header guards (extern "C") are only necessary around the bindings (any file that has an MPIL_Foo function). Any other header or file inside the library can skip them, since the main locality_aware.h header doesn't use these files.

If removing this causes undefined-reference errors at link time, let me know and I can help trace it down.
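To illustrate the distinction, here is a minimal single-file sketch. MPIL_Foo is the placeholder name used in this comment and internal_helper is a hypothetical internal routine; neither is taken from the library. Only the public binding is wrapped in the guard, so it gets C linkage when compiled as C++, while the internal function keeps ordinary C++ linkage.

```cpp
// Public binding: wrapped in the guard so C callers can link against it.
#ifdef __cplusplus
extern "C" {
#endif

int MPIL_Foo(int value);  // placeholder binding name, not a real library call

#ifdef __cplusplus
}
#endif

// Internal routine (hypothetical): only called from the library's own C++
// sources, so it needs no guard and keeps C++ linkage.
int internal_helper(int value)
{
    return value * 2;
}

// The definition inherits C linkage from the guarded declaration above.
int MPIL_Foo(int value)
{
    return internal_helper(value) + 1;
}
```

In a real layout the guarded declaration would live in the public header and the two definitions in separate source files; they are collapsed into one file here only to keep the sketch self-contained.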

extern "C" {
#endif

typedef struct _MPIL_Request MPIL_Request;
Contributor

Was this necessary to avoid some compilation error?

If the compiler was unhappy about MPIL_Request* foo inside the struct, I believe you can just use _MPIL_Request* foo instead. I only ask because Andrew and I had a few headaches resolving naming issues upstream when we did it like this, so if possible, I want to try to keep it the same.
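As a rough sketch of that alternative: only the names _MPIL_Request and MPIL_Request come from the code above, while the member names and the helper function are hypothetical. The struct refers to itself through its tag, so the typedef does not have to come first. In C++ the bare tag _MPIL_Request* is enough; the struct _MPIL_Request* spelling used here works in both C and C++.

```cpp
#include <cstddef>

// The struct refers to itself by its tag, so no forward typedef is needed.
struct _MPIL_Request
{
    struct _MPIL_Request* local_L_request;  // hypothetical member names
    struct _MPIL_Request* local_S_request;
    struct _MPIL_Request* local_R_request;
    int tag;                                // hypothetical payload
};

// The typedef can follow the full definition.
typedef struct _MPIL_Request MPIL_Request;

// Hypothetical helper: counts how many nested request pointers are set.
int count_nested_requests(const MPIL_Request* request)
{
    int count = 0;
    if (request->local_L_request != NULL)
    {
        count++;
    }
    if (request->local_S_request != NULL)
    {
        count++;
    }
    if (request->local_R_request != NULL)
    {
        count++;
    }
    return count;
}
```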

locality_comm.cpp
topology_functions.cpp
) No newline at end of file
get_tag.cpp
Contributor

Can I get the files in alphabetical order here? :)
Just personal preference

Comment on lines +43 to +46
// Get MPI Information
int rank, num_procs;
MPI_Comm_rank(mpil_comm->global_comm, &rank);
MPI_Comm_size(mpil_comm->global_comm, &num_procs);
Contributor

Suggested change (remove these lines):
// Get MPI Information
int rank, num_procs;
MPI_Comm_rank(mpil_comm->global_comm, &rank);
MPI_Comm_size(mpil_comm->global_comm, &num_procs);

These aren't used in this function, might as well delete them until needed.

// Initialize final variable (MPI_Request arrays, etc.)
finalize_locality_comm(locality_comm);
// Don't need local_S or global recv indices (just contiguous)
if (local_S_recv_data->indices)
Contributor

Suggested change
- if (local_S_recv_data->indices)
+ if (local_S_recv_data->indices != NULL)

free(local_S_recv_data->indices);
local_S_recv_data->indices = NULL;
}
if (global_recv_data->indices)
Contributor

Suggested change
- if (global_recv_data->indices)
+ if (global_recv_data->indices != NULL)

Comment on lines +853 to +874
if (data->procs)
{
free(data->procs);
}
if (data->indptr)
{
free(data->indptr);
}
if (data->counts)
{
free(data->counts);
}
if (data->indices)
{
free(data->indices);
}
if (data->buffer)
{
free(data->buffer);
}

// Don't need local_S or global recv indices (just contiguous)
free(locality->local_S_comm->recv_data->indices);
locality->local_S_comm->recv_data->indices = NULL;
free(locality->global_comm->recv_data->indices);
locality->global_comm->recv_data->indices = NULL;
free(data);
Contributor

For consistency, can these all get:

  • != NULL added to the if statements
  • Variables set to NULL after freeing
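A minimal sketch of that pattern, assuming a CommData-like struct with the five members freed above. The int* types for procs and indptr are an assumption, and free_comm_data_members is a hypothetical helper name. Since free(NULL) is already a no-op in standard C and C++, the != NULL checks are purely a consistency choice; resetting each pointer after freeing is what protects against a later double free.

```cpp
#include <cstdlib>

typedef struct
{
    int* procs;    // type assumed for this sketch
    int* indptr;   // type assumed for this sketch
    int* counts;
    int* indices;
    char* buffer;
} CommData;

// Hypothetical helper: frees each member array and resets it to NULL,
// so calling it twice on the same struct is harmless.
void free_comm_data_members(CommData* data)
{
    if (data->procs != NULL)
    {
        free(data->procs);
        data->procs = NULL;
    }
    if (data->indptr != NULL)
    {
        free(data->indptr);
        data->indptr = NULL;
    }
    if (data->counts != NULL)
    {
        free(data->counts);
        data->counts = NULL;
    }
    if (data->indices != NULL)
    {
        free(data->indices);
        data->indices = NULL;
    }
    if (data->buffer != NULL)
    {
        free(data->buffer);
        data->buffer = NULL;
    }
}
```

The sketch leaves the struct itself allocated; if the caller goes on to free(data), as the reviewed code does, that pointer would be set to NULL in the same way.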

// Initialize send side for dynamic communication
std::vector<int> dest(global_send_data->num_msgs);
std::vector<int> vals(global_send_data->num_msgs, mpil_comm->rank_node);
for (int i = 0; i < global_send_data->num_msgs; i++)
Contributor

Optional: Add "{}" to all the one line for loops.
