MPI example #270

Merged: jwallwork23 merged 19 commits into main from 257_mpi-example on Mar 10, 2025


jwallwork23 (Collaborator) commented on Jan 30, 2025

Closes #257.

This PR adds a CPU-only example that uses MPI, set up in the same way as the multi-GPU example was before #268.

It runs the net with a different input on each MPI rank, then gathers the outputs on the root rank and checks that they are correct. There's also a check that the number of MPI ranks is greater than 1, to help identify any config errors.
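
For illustration only, here is a minimal sketch of that flow in plain Python. It is not the code from this PR: no MPI library is used, and the ranks are simulated with a loop inside one process so the logic can be read and run anywhere. The names `stand_in_net`, `make_input`, `check_world_size` and `run_simulated` are hypothetical, and the doubling "net" and input length of 5 are assumptions made for the sketch.

```python
"""Single-process sketch of per-rank inference, gather to root, and check.

Assumption: ranks are simulated with a loop; no MPI library is involved.
All names below are hypothetical and exist only for this illustration.
"""


def stand_in_net(inputs):
    # Hypothetical stand-in for the real network: doubles each input value.
    return [2.0 * x for x in inputs]


def make_input(rank, length=5):
    # Give each rank a different input by offsetting a base vector by its rank.
    return [float(i + rank) for i in range(length)]


def check_world_size(size):
    # Mirrors the config check: a single rank suggests the launch was misconfigured.
    if size <= 1:
        raise ValueError(f"expected more than 1 rank, got {size}")


def run_simulated(size):
    check_world_size(size)
    # Each simulated rank runs the net on its own input.
    # The resulting list plays the role of the buffer gathered on the root rank,
    # with one output per rank, ordered by rank.
    gathered = [stand_in_net(make_input(rank)) for rank in range(size)]
    # The root verifies every rank's output against the expected result.
    for rank, output in enumerate(gathered):
        expected = [2.0 * x for x in make_input(rank)]
        if output != expected:
            raise AssertionError(f"rank {rank}: got {output}, expected {expected}")
    return gathered
```

In a real MPI run each rank is a separate process, so the list comprehension would be replaced by a gather call that collects each rank's output on the root, and only the root would run the verification loop.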

jwallwork23 added the documentation and testing labels on Jan 30, 2025
jwallwork23 self-assigned this on Jan 30, 2025
jwallwork23 (Collaborator, Author) commented on Jan 30, 2025

I decided to drop MPI in the case of Windows for now.

jwallwork23 (Collaborator, Author) commented

[Rebased on top of main to pick up build dir move]

jwallwork23 marked this pull request as ready for review on February 11, 2025 12:29

jatkinson1000 (Member) left a comment

Hi @jwallwork23 I reviewed this a week ago, but just realised I had not submitted it so all my comments were marked as 'pending', apologies.

Generally looks good, but left a couple of points around best practices for you to consider.

jwallwork23 (Collaborator, Author) commented

[Rebased on top of main.]

jatkinson1000 (Member) left a comment

Thanks for implementing those suggestions @jwallwork23

Happy for this to be merged now.

jwallwork23 merged commit edd1ebb into main on Mar 10, 2025
5 of 6 checks passed
jwallwork23 deleted the 257_mpi-example branch on March 10, 2025 12:39
Labels: documentation, testing
Linked issue that merging this pull request may close: Separate MPI out of example 3
2 participants