-
Notifications
You must be signed in to change notification settings - Fork 156
add io offload prototype #1774
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
add io offload prototype #1774
Conversation
vithikashah001
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needs documentation for variables, subroutines and functions.
uramirez8707
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor changes, but I think this is a good start
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs the apache license
test_fms/offloading/test_io_offloading.F90 and test_fms/offloading/test_metadata_transfer.F90 needs it too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The license for this file should be apache
test_fms/offloading/test_io_offloading.sh needs it too
| integer, parameter :: non_domain_decomposed = 1 !< enumeration for type of domain | ||
| integer, parameter :: Unstructured_grid = 2 !< enumeration for type of domain | ||
|
|
||
| integer, parameter :: max_files = 10 !< amount of offloaded files to allocate space for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO Make this a namelist, runtime constant, etc
| !! Registers an axis to a netcdf file on offloaded PEs. File must have been opened with open_file_offload. | ||
| interface register_axis_offload | ||
| procedure :: register_netcdf_axis_offload | ||
| procedure :: register_domain_axis_offload |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO we need a register_unstructured_axis_offload for the unstructured grid
|
|
||
| !< The number of tiles must be the same as the number of offloading pes | ||
| !if (size(pe_out) .ne. ntile ) & | ||
| ! call mpp_error(FATAL, "The number of offloading PEs must be the same as the number of tiles of the domain") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be added back, and it should crash if the ntiles is not a multiple of the ntiles
| call mpp_broadcast(filename_out, str_len, pe_in(1)) | ||
| call mpp_broadcast(global_domain_size, size(global_domain_size), pe_in(1)) | ||
| call mpp_broadcast(ntile, pe_in(1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO maybe we can find a way to do this all in one mpp_brodcast call (or mpp_send/mpp_receive)
| integer, allocatable :: all_current_pes(:) | ||
| integer, allocatable :: broadcasting_pes(:) | ||
| logical :: is_model_pe | ||
| character(len=255) :: var_info(2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update to use the max str length variable
| if (.not. is_model_pe) then | ||
| select type(wut=>this%fileobj) | ||
| type is(FmsNetcdfDomainFile_t) | ||
| call register_axis(wut, trim(var_info(1)), trim(var_info(2))) | ||
| end select | ||
| endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's clean this up
| type(domain2D) :: domain_out | ||
| type(domain2D) :: domain_in | ||
| integer :: isc, iec, jsc, jec, nz, redistribute_clock | ||
| character(len=255) :: varname_tmp(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use max str length
| !> Structure to hold offloading file information | ||
| type :: offloading_obj_out | ||
| integer :: id | ||
| character(len=:), allocatable :: filename !< filename of the offloaded netcdf file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a need to store the filename here, as opposed to getting it from fileobj%path?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No looks like it can be removed, I'll add that change.
| character(len=:), allocatable :: filename !< filename of the offloaded netcdf file | ||
| class(FmsNetcdfFile_t), allocatable :: fileobj !< fms2_io file object | ||
| type(domain2D) :: domain_out !< domain on offloading PEs | ||
| integer :: type_of_domain !< type of domain (domain_decomposed, non_domain_decomposed, Unstructured_grid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't seem that type_of_domain is used. Is it needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be able to remove this as well.
| integer :: type_of_domain !< type of domain (domain_decomposed, non_domain_decomposed, Unstructured_grid) | ||
| end type | ||
|
|
||
| !> Offload equivalent of open_file in fms2_io_mod |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment seems like it belongs above open_file_offload.
| type_of_domain = domain_decomposed | ||
| type is (FmsNetcdfUnstructuredDomainFile_t) | ||
| type_of_domain = Unstructured_grid | ||
| end select |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type_of_domain is assigned, but never used.
| endif | ||
|
|
||
| if (.not. is_model_pe) then | ||
| select type(wut=>this%fileobj) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since FmsNetcdfDomainFile_t has no types derived from it, perhaps change fileobj from class(FmsNetcdfDomainFile_t) to type(FmsNetcdfDomainFile_t), and remove the select type.
| endif | ||
|
|
||
| if (.not. is_model_pe) then | ||
| select type(wut=>this%fileobj) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above comment, regarding class(FmsNetcdfDomainFile_t) to type(FmsNetcdfDomainFile_t).
| select type(wut=>this%fileobj) | ||
| type is(FmsNetcdfDomainFile_t) | ||
| call register_field(wut, trim(var_info(1)), trim(var_info(2)), var_info(3:)) | ||
| end select |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem to handle the case where fileobj is FmsNetcdfFile_t as opposed to FmsNetcdfDomainFile_t - is this case supported?
| !! This routine should be called from both the model PEs and the offload PEs, with the full list | ||
| !! of pes for each group being provided. The model PEs will broadcast the filename and domain. | ||
| subroutine open_file_offload(fileobj, filename, domain_in, pe_in, pe_out) | ||
| class(FmsNetcdfFile_t), intent(inout) :: fileobj !< fms2_io file object |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some other subroutines (e.g., write_data_offload) only seem to support FmsNetcdfDomainFile_t and not the FmsNetcdfFile_t base type. If non-domain-decomposed files are unsupported, type(FmsNetcdfDomainFile_t) should probably be used consistently for fileobj arguments.
Description
draft for the io offloading changes
How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Please also note
any relevant details for your test configuration (e.g. compiler, OS). Include
enough information so someone can reproduce your tests.
Checklist:
make distcheckpasses