Add JCB options for writing jdiag files in parallel #530

Merged
ShunLiu-NOAA merged 1 commit into NOAA-EMC:develop from SamuelDegelia-NOAA:feature/io_pool_jcb
Jan 30, 2026

@SamuelDegelia-NOAA SamuelDegelia-NOAA commented Jan 30, 2026

Description

This PR adds two new JCB configuration options to control how jdiag files are written in fv3-jedi. These options are intended to significantly reduce jdiag write time in rrfs-workflow, primarily for na3km.

When including the full observation set on na3km, writing the jdiag files can take more than 20 minutes in total. The following options are introduced to address this bottleneck:

  • max pool size: sets the number of MPI tasks used for writing jdiag files.
  • write multiple files: allows each MPI task to write its own portion of the jdiag output rather than aggregating to a single file.

Using max pool size: 80 together with write multiple files: true reduces jdiag write time to approximately 1 to 2 minutes for the na3km domain.
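
To make the effect of the two options on the output concrete, here is a minimal sketch that estimates how many files one observation type would produce. The function name `expected_jdiag_files` and the `total_ranks` parameter are hypothetical, and the sketch assumes the writer pool grows to the configured maximum, capped by the number of MPI ranks in the run; the actual pool sizing in the I/O layer may differ.

```python
def expected_jdiag_files(total_ranks: int,
                         max_pool_size: int = 1,
                         write_multiple_files: bool = False) -> int:
    """Estimate the number of jdiag files written for one observation type.

    Hypothetical helper for illustration only; it is not part of JCB or fv3-jedi.
    """
    if total_ranks < 1 or max_pool_size < 1:
        raise ValueError("total_ranks and max_pool_size must be positive")
    if not write_multiple_files:
        # Output is aggregated into a single file, whatever the pool size.
        return 1
    # Assumption: the pool fills up to the cap, limited by the ranks available.
    return min(total_ranks, max_pool_size)


# With the settings quoted above and a hypothetical 160-rank run:
print(expected_jdiag_files(160, max_pool_size=80, write_multiple_files=True))
```

Anything that reads the jdiag output afterwards would need to handle several files per observation type when write multiple files is enabled.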

Note that the default values for these options (max pool size: 1 and write multiple files: false) match the current behavior when the options are not specified. As a result, existing configurations and ctests are unaffected unless the new options are explicitly set in the JCB config file.
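
The backward-compatibility argument can be sketched as a defaults merge: a configuration that omits the new keys resolves to the same values as before. The option names and default values come from the description above; the `DEFAULT_IO_POOL` and `resolve_io_pool` names are hypothetical and do not reflect how JCB actually renders its templates.

```python
# Defaults as stated in the PR description.
DEFAULT_IO_POOL = {"max pool size": 1, "write multiple files": False}


def resolve_io_pool(overrides=None):
    """Return the effective settings: defaults plus any explicit overrides.

    Hypothetical helper for illustration only.
    """
    overrides = dict(overrides or {})
    unknown = set(overrides) - set(DEFAULT_IO_POOL)
    if unknown:
        raise KeyError(f"unknown option(s): {sorted(unknown)}")
    return {**DEFAULT_IO_POOL, **overrides}


# An existing config that sets neither option keeps the current behavior.
print(resolve_io_pool())

# A config that opts in, using the values quoted for na3km.
print(resolve_io_pool({"max pool size": 80, "write multiple files": True}))
```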

Issue(s) addressed

None

Dependencies (if applicable)

None

Checklist

  • I have performed a self-review of my own code.
  • I have run rrfs tests before creating the PR (if applicable).
  • Unit tests added/updated (if applicable).

@delippi delippi left a comment

Very basic change. A lot of files! Looks good!

@ShunLiu-NOAA ShunLiu-NOAA merged commit 5ea15b4 into NOAA-EMC:develop Jan 30, 2026
1 check passed
@SamuelDegelia-NOAA SamuelDegelia-NOAA deleted the feature/io_pool_jcb branch January 30, 2026 15:30

rrfsbot commented Jan 30, 2026

PASSED on hera

started build_and_test on hera at UTC time: Fri Jan 30 15:05:06 UTC 2026
finished at UTC time: Fri Jan 30 15:34:56 UTC 2026

Test project /scratch3/NCEPDEV/fv3-cam/rrfsbot/PRs_RDASApp/530/build/rrfs-test
      Start  6: rrfs_fv3jedi_2024052700_getkf_observer
      Start 15: rrfs_mpasjedi_2024052700_getkf_observer
      Start  1: rrfs_fv3jedi_2024052700_3dvar
      Start  2: rrfs_fv3jedi_2024052700_3denvar
      Start  3: rrfs_fv3jedi_2024052700_3denvar_mgbf
      Start  4: rrfs_fv3jedi_2024052700_hybrid3denvar
      Start  5: rrfs_fv3jedi_2024052700_hybrid3denvar_mgbf
      Start  8: rrfs_fv3jedi_2024052700_3dvar_conv_surface
 1/18 Test  #1: rrfs_fv3jedi_2024052700_3dvar .................   Passed   30.30 sec
      Start  9: rrfs_fv3jedi_2024052700_3dvar_conv_upperair
 2/18 Test  #8: rrfs_fv3jedi_2024052700_3dvar_conv_surface ....   Passed   55.90 sec
      Start 10: rrfs_fv3jedi_2024052700_3dvar_remote
 3/18 Test  #6: rrfs_fv3jedi_2024052700_getkf_observer ........   Passed   67.99 sec
      Start  7: rrfs_fv3jedi_2024052700_getkf_solver
 4/18 Test  #9: rrfs_fv3jedi_2024052700_3dvar_conv_upperair ...   Passed   45.90 sec
      Start 11: rrfs_fv3jedi_2024052700_3dvar_satrad
 5/18 Test #10: rrfs_fv3jedi_2024052700_3dvar_remote ..........   Passed   20.52 sec
      Start 12: rrfs_fv3jedi_2024052700_3denvar_refl
 6/18 Test  #7: rrfs_fv3jedi_2024052700_getkf_solver ..........   Passed   54.93 sec
      Start 13: rrfs_mpasjedi_2024052700_bumploc
 7/18 Test  #2: rrfs_fv3jedi_2024052700_3denvar ...............   Passed  129.50 sec
      Start 14: rrfs_mpasjedi_2024052700_3denvar
 8/18 Test #11: rrfs_fv3jedi_2024052700_3dvar_satrad ..........   Passed   61.02 sec
      Start 17: rrfs_mpasjedi_2024052700_3dvar
 9/18 Test  #4: rrfs_fv3jedi_2024052700_hybrid3denvar .........   Passed  172.05 sec
      Start 18: rrfs_bufr2ioda_msonet
10/18 Test #17: rrfs_mpasjedi_2024052700_3dvar ................   Passed   54.23 sec
11/18 Test #18: rrfs_bufr2ioda_msonet .........................   Passed   28.21 sec
12/18 Test  #3: rrfs_fv3jedi_2024052700_3denvar_mgbf ..........   Passed  226.22 sec
13/18 Test  #5: rrfs_fv3jedi_2024052700_hybrid3denvar_mgbf ....   Passed  243.32 sec
14/18 Test #15: rrfs_mpasjedi_2024052700_getkf_observer .......   Passed  272.26 sec
      Start 16: rrfs_mpasjedi_2024052700_getkf_solver
15/18 Test #14: rrfs_mpasjedi_2024052700_3denvar ..............   Passed  261.15 sec
16/18 Test #13: rrfs_mpasjedi_2024052700_bumploc ..............   Passed  309.06 sec
17/18 Test #16: rrfs_mpasjedi_2024052700_getkf_solver .........   Passed  184.23 sec
18/18 Test #12: rrfs_fv3jedi_2024052700_3denvar_refl ..........   Passed  464.84 sec

100% tests passed, 0 tests failed out of 18

Label Time Summary:
mpi            = 2681.63 sec*proc (18 tests)
rdas-bundle    = 2681.63 sec*proc (18 tests)
script         = 2681.63 sec*proc (18 tests)

Total Test time (real) = 541.33 sec

workdir: /scratch3/NCEPDEV/fv3-cam/rrfsbot/PRs_RDASApp/530

@SamuelDegelia-NOAA

PASSED on wcoss2

started build_and_test on wcoss2 at UTC time: Fri Jan 30 14:58:51 UTC 2026
finished at UTC time: Fri Jan 30 15:53:33 UTC 2026

Test project /lfs/h2/emc/da/noscrub/samuel.degelia/rrfsbot/PRs_RDASApp/530/build/rrfs-test
      Start  6: rrfs_fv3jedi_2024052700_getkf_observer
      Start 15: rrfs_mpasjedi_2024052700_getkf_observer
      Start  1: rrfs_fv3jedi_2024052700_3dvar
      Start  2: rrfs_fv3jedi_2024052700_3denvar
      Start  3: rrfs_fv3jedi_2024052700_3denvar_mgbf
      Start  4: rrfs_fv3jedi_2024052700_hybrid3denvar
      Start  5: rrfs_fv3jedi_2024052700_hybrid3denvar_mgbf
      Start  8: rrfs_fv3jedi_2024052700_3dvar_conv_surface
      Start  9: rrfs_fv3jedi_2024052700_3dvar_conv_upperair
      Start 10: rrfs_fv3jedi_2024052700_3dvar_remote
 1/18 Test #10: rrfs_fv3jedi_2024052700_3dvar_remote ..........   Passed   77.80 sec
      Start 11: rrfs_fv3jedi_2024052700_3dvar_satrad
 2/18 Test  #1: rrfs_fv3jedi_2024052700_3dvar .................   Passed   87.80 sec
      Start 12: rrfs_fv3jedi_2024052700_3denvar_refl
 3/18 Test  #9: rrfs_fv3jedi_2024052700_3dvar_conv_upperair ...   Passed  101.78 sec
      Start 13: rrfs_mpasjedi_2024052700_bumploc
 4/18 Test  #8: rrfs_fv3jedi_2024052700_3dvar_conv_surface ....   Passed  102.69 sec
      Start 14: rrfs_mpasjedi_2024052700_3denvar
 5/18 Test  #6: rrfs_fv3jedi_2024052700_getkf_observer ........   Passed  133.73 sec
      Start  7: rrfs_fv3jedi_2024052700_getkf_solver
 6/18 Test #11: rrfs_fv3jedi_2024052700_3dvar_satrad ..........   Passed  127.03 sec
      Start 17: rrfs_mpasjedi_2024052700_3dvar
 7/18 Test  #2: rrfs_fv3jedi_2024052700_3denvar ...............   Passed  242.72 sec
      Start 18: rrfs_bufr2ioda_msonet
 8/18 Test  #4: rrfs_fv3jedi_2024052700_hybrid3denvar .........   Passed  246.76 sec
 9/18 Test #18: rrfs_bufr2ioda_msonet .........................   Passed   34.36 sec
10/18 Test  #7: rrfs_fv3jedi_2024052700_getkf_solver ..........   Passed  149.17 sec
11/18 Test  #3: rrfs_fv3jedi_2024052700_3denvar_mgbf ..........   Passed  291.70 sec
12/18 Test  #5: rrfs_fv3jedi_2024052700_hybrid3denvar_mgbf ....   Passed  300.74 sec
13/18 Test #17: rrfs_mpasjedi_2024052700_3dvar ................   Passed  106.95 sec
14/18 Test #13: rrfs_mpasjedi_2024052700_bumploc ..............   Passed  346.98 sec
15/18 Test #15: rrfs_mpasjedi_2024052700_getkf_observer .......   Passed  454.75 sec
      Start 16: rrfs_mpasjedi_2024052700_getkf_solver
16/18 Test #14: rrfs_mpasjedi_2024052700_3denvar ..............   Passed  458.13 sec
17/18 Test #12: rrfs_fv3jedi_2024052700_3denvar_refl ..........   Passed  651.99 sec
18/18 Test #16: rrfs_mpasjedi_2024052700_getkf_solver .........   Passed  306.13 sec

100% tests passed, 0 tests failed out of 18

Label Time Summary:
rdas-bundle    = 4221.22 sec*proc (18 tests)
script         = 4221.22 sec*proc (18 tests)

Total Test time (real) = 760.94 sec

workdir: /lfs/h2/emc/da/noscrub/samuel.degelia/rrfsbot/PRs_RDASApp/530
