Perform forecast output copies to COM using MPMD #4575

Draft
Copilot wants to merge 15 commits into develop from copilot/perform-parallel-copies-com

Conversation

Contributor

Copilot AI commented Feb 20, 2026

Description

Forecast output copy functions in ush/forecast_postdet.sh perform serial cpfs loops that can take several minutes at high resolution (e.g., GDAS FV3 restarts: ~31 files × N restart dates). Replace all serial copy loops with MPMD-backed parallel execution using run_mpmd.sh.

Changes

  • FV3_out: Builds a cmdfile with one cpfs src dest line per tile/date combination, then calls run_mpmd.sh. This is the most impactful change: GDAS at C384 produces 30+ files per restart date across multiple dates.
  • WW3_out, MOM6_out, CICE_out, CMEPS_out: Same pattern, guarded with a -s check because the copies are conditional and the cmdfile may be empty.
  • GOCART_out: Up to 16 file types × many forecast hours; existence-conditional copies are added to cmdfile before MPMD dispatch.
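
The guarded pattern used for the conditional components can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: build_copy_cmdfile and dispatch_copies are hypothetical helper names, plain cp stands in for cpfs, and RUN_MPMD is an assumed variable that defaults to bash so the command file simply runs serially outside the workflow.

```shell
#!/usr/bin/env bash
# Sketch only. Helper names and RUN_MPMD are hypothetical; cp stands in for cpfs.

# Append one copy command per source file that actually exists.
build_copy_cmdfile() {
    local cmdfile=$1 src_dir=$2 dest_dir=$3
    shift 3
    rm -f "${cmdfile}"
    local name
    for name in "$@"; do
        if [[ -f "${src_dir}/${name}" ]]; then
            echo "cp ${src_dir}/${name} ${dest_dir}/${name}" >> "${cmdfile}"
        fi
    done
}

# Dispatch only when the cmdfile exists and is non-empty (-s).
dispatch_copies() {
    local cmdfile=$1
    if [[ -s "${cmdfile}" ]]; then
        "${RUN_MPMD:-bash}" "${cmdfile}"
    else
        echo "No files to copy; skipping dispatch"
    fi
}
```

In the workflow, RUN_MPMD would presumably point at "${USHgfs}/run_mpmd.sh"; the -s test only prevents an empty or missing cmdfile from being dispatched.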

Pattern

Follows the existing run_mpmd.sh convention used in regrid_gsiSfcIncr_to_tile.sh, exglobal_atmos_analysis.sh, and exglobal_diag.sh:

# Start from a fresh command file for this set of copies
local cmdfile="${DATA}/cmdfile_fv3_out"
rm -f "${cmdfile}"
# Queue one cpfs command per restart date and file
for restart_date in "${restart_dates[@]}"; do
    for fv3_file in ${file_list}; do
        echo "cpfs ${DATArestart}/FV3_RESTART/${restart_date}.${fv3_file} ${COMOUT_ATMOS_RESTART}/${restart_date}.${fv3_file}" >> "${cmdfile}"
    done
done

# "&& true" keeps errexit (set -e), if enabled, from aborting before err is checked
"${USHgfs}/run_mpmd.sh" "${cmdfile}" && true
export err=$?
if [[ ${err} -ne 0 ]]; then
    err_exit "run_mpmd.sh failed to copy FV3 restart files!"
fi

When USE_CFP=NO, run_mpmd.sh falls back to serial execution automatically, so behavior is unchanged in non-CFP environments.
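
To make the serial fallback concrete, here is a simplified stand-in for that kind of dispatcher. It is not the actual run_mpmd.sh: run_cmdfile_sketch is a hypothetical name, and plain background jobs are used in place of a CFP/MPMD launcher purely to show the two code paths selected by USE_CFP.

```shell
#!/usr/bin/env bash
# Sketch only: NOT the real run_mpmd.sh. Background jobs stand in for the
# MPMD launcher; run_cmdfile_sketch is a hypothetical name.

run_cmdfile_sketch() {
    local cmdfile=$1 line pid rc=0
    local -a pids=()
    if [[ "${USE_CFP:-NO}" == "YES" ]]; then
        # "Parallel" path: start every command, then collect exit codes.
        while IFS= read -r line; do
            [[ -z "${line}" ]] && continue
            bash -c "${line}" < /dev/null &
            pids+=("$!")
        done < "${cmdfile}"
        for pid in "${pids[@]}"; do
            wait "${pid}" || rc=$?
        done
    else
        # Serial fallback: run commands in order, stop at the first failure.
        while IFS= read -r line; do
            [[ -z "${line}" ]] && continue
            bash -c "${line}" < /dev/null || { rc=$?; break; }
        done < "${cmdfile}"
    fi
    return "${rc}"
}
```

Either path consumes the same cmdfile and returns a non-zero status if any command fails, which is why the calling code in the *_out functions does not need to know which mode is active.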

Resolves #2865

Type of change

  • Bug fix (fixes something broken)
  • New feature (adds functionality)
  • Maintenance (code refactor, clean-up, new CI test, etc.)

Change characteristics

  • Is this change expected to change outputs (e.g. value changes to existing outputs, new files stored in COM, files removed from COM, filename changes, additions/subtractions to archives)? NO
  • Is this a breaking change (a change in existing functionality)? NO
  • Does this change require a documentation update? NO
  • Does this change require an update to any of the following submodules? NO

How has this been tested?

Checklist

  • Any dependent changes have been merged and published
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have documented my code, including function, input, and output descriptions
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • This change is covered by an existing CI test or a new one has been added
  • Any new scripts have been added to the .github/CODEOWNERS file with owners
  • I have made corresponding changes to the system documentation if necessary
Original prompt

This section details the original issue you should resolve

<issue_title>Perform copies to COM using MPMD for forecast outputs</issue_title>
<issue_description>### What new functionality do you need?

The gdasfcst job creates a suite of restart files that need to be copied back to COM for the next cycle's forecasts. This process can take several minutes at high resolution. It would be better if this could be done in a parallel manner using MPMD. This could be expanded to the forecast outputs in general if that is easier.

Acceptance Criteria

  • Copies are performed in parallel with run_mpmd.sh as the backbone
  • Parallel copies are faster than serial

Suggest a solution (optional)

Implement run_mpmd.sh in each of the *_out functions (FV3_out, MOM6_out, etc) in forecast_postdet.sh.</issue_description>

<agent_instructions>This work would be done in the ush/forecast_postdet.sh script within the FV3_out, WW3_out, etc, functions. These functions are called at the end of dev/scripts/exglobal_forecast.sh. Use the dev/run_mpmd.sh script for the backend of the MPMD process. See e.g. dev/scripts/exglobal_atmos_analysis.sh and dev/scripts/exglobal_diag.sh for examples on how to use run_mpmd.sh.</agent_instructions>

Comments on the Issue (you are @copilot in this section)



Copilot AI changed the title [WIP] Implement parallel copies to COM using MPMD for forecasts Perform forecast output copies to COM using MPMD Feb 20, 2026
emcbot added the CI-Gaeac6-Ready, CI-Ursa-Ready, CI-Ursa-Building, CI-Gaeac6-Building, CI-Gaeac6-Running, CI-Ursa-Running, and CI-Gaeac6-Passed labels and removed the CI-Ursa-Ready, CI-Gaeac6-Ready, CI-Gaeac6-Building, CI-Ursa-Building, and CI-Gaeac6-Running labels Feb 20, 2026
Collaborator

TerrenceMcGuinness-NOAA commented Feb 23, 2026

The GitLab CI pipeline stalled on Ursa for this PR on Friday evening. It is now running nominally from a Monday morning nudge from within the GitLab controller.

@DavidHuber-NOAA
Contributor

This change worked on Gaea C6 and I have verified that the contents are identical to the nightly run of a C96C48mx500_cyc_gfs test case. Opening for review.

@DavidHuber-NOAA DavidHuber-NOAA marked this pull request as ready for review February 23, 2026 17:52
@DavidHuber-NOAA DavidHuber-NOAA removed the request for review from aerorahul February 23, 2026 17:52
@emcbot emcbot removed the CI-Ursa-Running **Bot use only** CI testing on Ursa for this PR is in-progress label Feb 23, 2026
@emcbot emcbot added CI-Hera-Ready **CM use only** PR is ready for CI testing on Hera CI-Hera-Building **Bot use only** CI testing is cloning/building on Hera CI-Hera-Running **Bot use only** CI testing on Hera for this PR is in-progress and removed CI-Hera-Ready **CM use only** PR is ready for CI testing on Hera CI-Hera-Building **Bot use only** CI testing is cloning/building on Hera labels Feb 25, 2026

emcbot commented Feb 25, 2026

C48mx500_3DVarAOWCDA FAILED on Hera (pipeline ID: 9636)

In directory: /scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/EXPDIR/C48mx500_3DVarAOWCDA_733bbf3e-9636

Error Log Files:


/scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/COMROOT/C48mx500_3DVarAOWCDA_733bbf3e-9636/logs/2021032418/gdas_fcst_seg0.log

View Error Logs: (gdas_fcst_seg0.log)

This failure was detected automatically by global-workflow's CI/CD Pipeline


emcbot commented Feb 25, 2026

C48_gsienkf_atmDA FAILED on Hera (pipeline ID: 9636)

In directory: /scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/EXPDIR/C48_gsienkf_atmDA_733bbf3e-9636

Error Log Files:


/scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/COMROOT/C48_gsienkf_atmDA_733bbf3e-9636/logs/2024022318/enkfgdas_fcst_mem001.log
/scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/COMROOT/C48_gsienkf_atmDA_733bbf3e-9636/logs/2024022318/enkfgdas_fcst_mem002.log

View Error Logs: (enkfgdas_fcst_mem001.log) (enkfgdas_fcst_mem002.log)

This failure was detected automatically by global-workflow's CI/CD Pipeline

@emcbot emcbot added CI-Hera-Failed **Bot use only** CI testing on Hera for this PR has failed and removed CI-Hera-Running **Bot use only** CI testing on Hera for this PR is in-progress labels Feb 25, 2026

emcbot commented Feb 25, 2026

C96C48_hybatmsnowDA FAILED on Hera (pipeline ID: 9636)

In directory: /scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/EXPDIR/C96C48_hybatmsnowDA_733bbf3e-9636

Error Log Files:


/scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/COMROOT/C96C48_hybatmsnowDA_733bbf3e-9636/logs/2021122012/enkfgdas_fcst_mem001.log
/scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/COMROOT/C96C48_hybatmsnowDA_733bbf3e-9636/logs/2021122012/enkfgdas_fcst_mem002.log

View Error Logs: (enkfgdas_fcst_mem001.log) (enkfgdas_fcst_mem002.log)

This failure was detected automatically by global-workflow's CI/CD Pipeline


emcbot commented Feb 25, 2026

C96C48_ufsgsi_hybatmDA FAILED on Hera (pipeline ID: 9636)

In directory: /scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/EXPDIR/C96C48_ufsgsi_hybatmDA_733bbf3e-9636

Error Log Files:


/scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/COMROOT/C96C48_ufsgsi_hybatmDA_733bbf3e-9636/logs/2024022318/enkfgdas_fcst_mem001.log
/scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/COMROOT/C96C48_ufsgsi_hybatmDA_733bbf3e-9636/logs/2024022318/enkfgdas_fcst_mem002.log

View Error Logs: (enkfgdas_fcst_mem001.log) (enkfgdas_fcst_mem002.log)

This failure was detected automatically by global-workflow's CI/CD Pipeline


emcbot commented Feb 25, 2026

C96C48_hybatmDA FAILED on Hera (pipeline ID: 9636)

In directory: /scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/EXPDIR/C96C48_hybatmDA_733bbf3e-9636

Error Log Files:


/scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/COMROOT/C96C48_hybatmDA_733bbf3e-9636/logs/2021122018/enkfgdas_fcst_mem001.log
/scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/COMROOT/C96C48_hybatmDA_733bbf3e-9636/logs/2021122018/enkfgdas_fcst_mem002.log

View Error Logs: (enkfgdas_fcst_mem001.log) (enkfgdas_fcst_mem002.log)

This failure was detected automatically by global-workflow's CI/CD Pipeline


emcbot commented Feb 25, 2026

C96C48mx500_S2SW_cyc_gfs FAILED on Hera (pipeline ID: 9636)

In directory: /scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/EXPDIR/C96C48mx500_S2SW_cyc_gfs_733bbf3e-9636

Error Log Files:


/scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/COMROOT/C96C48mx500_S2SW_cyc_gfs_733bbf3e-9636/logs/2021122012/enkfgdas_fcst_mem001.log
/scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/COMROOT/C96C48mx500_S2SW_cyc_gfs_733bbf3e-9636/logs/2021122012/enkfgdas_fcst_mem002.log

View Error Logs: (enkfgdas_fcst_mem001.log) (enkfgdas_fcst_mem002.log)

This failure was detected automatically by global-workflow's CI/CD Pipeline


emcbot commented Feb 25, 2026

C96_gcafs_cycled FAILED on Hera (pipeline ID: 9636)

In directory: /scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/EXPDIR/C96_gcafs_cycled_733bbf3e-9636

Error Log Files:


/scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/COMROOT/C96_gcafs_cycled_733bbf3e-9636/logs/2021122012/gcdas_fcst_seg0.log

View Error Logs: (gcdas_fcst_seg0.log)

This failure was detected automatically by global-workflow's CI/CD Pipeline


emcbot commented Feb 25, 2026

C96_atm3DVar FAILED on Hera (pipeline ID: 9636)

In directory: /scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/EXPDIR/C96_atm3DVar_733bbf3e-9636

Error Log Files:


/scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/COMROOT/C96_atm3DVar_733bbf3e-9636/logs/2021122018/gdas_fcst_seg0.log

View Error Logs: (gdas_fcst_seg0.log)

This failure was detected automatically by global-workflow's CI/CD Pipeline


emcbot commented Feb 25, 2026

C48_ufsenkf_atmDA FAILED on Hera (pipeline ID: 9636)

In directory: /scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/EXPDIR/C48_ufsenkf_atmDA_733bbf3e-9636

Error Log Files:


/scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/COMROOT/C48_ufsenkf_atmDA_733bbf3e-9636/logs/2024022318/enkfgdas_fcst_mem001.log
/scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/COMROOT/C48_ufsenkf_atmDA_733bbf3e-9636/logs/2024022318/enkfgdas_fcst_mem002.log

View Error Logs: (enkfgdas_fcst_mem001.log) (enkfgdas_fcst_mem002.log)

This failure was detected automatically by global-workflow's CI/CD Pipeline


emcbot commented Feb 25, 2026

C96C48_hybatmsoilDA FAILED on Hera (pipeline ID: 9636)

In directory: /scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/EXPDIR/C96C48_hybatmsoilDA_733bbf3e-9636

Error Log Files:


/scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/COMROOT/C96C48_hybatmsoilDA_733bbf3e-9636/logs/2022051506/enkfgdas_fcst_mem001.log
/scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/COMROOT/C96C48_hybatmsoilDA_733bbf3e-9636/logs/2022051506/enkfgdas_fcst_mem002.log

View Error Logs: (enkfgdas_fcst_mem001.log) (enkfgdas_fcst_mem002.log)

This failure was detected automatically by global-workflow's CI/CD Pipeline


emcbot commented Feb 25, 2026

C48mx500_hybAOWCDA FAILED on Hera (pipeline ID: 9636)

In directory: /scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/EXPDIR/C48mx500_hybAOWCDA_733bbf3e-9636

Error Log Files:


/scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/COMROOT/C48mx500_hybAOWCDA_733bbf3e-9636/logs/2021032418/enkfgdas_fcst_mem001.log
/scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/COMROOT/C48mx500_hybAOWCDA_733bbf3e-9636/logs/2021032418/enkfgdas_fcst_mem002.log
/scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/COMROOT/C48mx500_hybAOWCDA_733bbf3e-9636/logs/2021032418/gdas_fcst_seg0.log

View Error Logs: (enkfgdas_fcst_mem001.log) (enkfgdas_fcst_mem002.log) (gdas_fcst_seg0.log)

This failure was detected automatically by global-workflow's CI/CD Pipeline


emcbot commented Feb 25, 2026

C96C48_ufs_hybatmDA FAILED on Hera (pipeline ID: 9636)

In directory: /scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/EXPDIR/C96C48_ufs_hybatmDA_733bbf3e-9636

Error Log Files:


/scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/COMROOT/C96C48_ufs_hybatmDA_733bbf3e-9636/logs/2024022318/enkfgdas_fcst_mem001.log
/scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/COMROOT/C96C48_ufs_hybatmDA_733bbf3e-9636/logs/2024022318/enkfgdas_fcst_mem002.log
/scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/COMROOT/C96C48_ufs_hybatmDA_733bbf3e-9636/logs/2024022318/gdas_fcst_seg0.log

View Error Logs: (enkfgdas_fcst_mem001.log) (enkfgdas_fcst_mem002.log) (gdas_fcst_seg0.log)

This failure was detected automatically by global-workflow's CI/CD Pipeline


emcbot commented Feb 25, 2026

C96_gcafs_cycled_noDA FAILED on Hera (pipeline ID: 9636)

In directory: /scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/EXPDIR/C96_gcafs_cycled_noDA_733bbf3e-9636

Error Log Files:


/scratch3/NCEPDEV/global/role.glopara/GFS_CI_CD/HERA/BUILDS/GITLAB/pr_cases_4575_733bbf3e_9636/RUNTESTS/COMROOT/C96_gcafs_cycled_noDA_733bbf3e-9636/logs/2021122012/gcdas_fcst_seg0.log

View Error Logs: (gcdas_fcst_seg0.log)

This failure was detected automatically by global-workflow's CI/CD Pipeline

@DavidHuber-NOAA DavidHuber-NOAA removed the CI-Hera-Failed **Bot use only** CI testing on Hera for this PR has failed label Feb 25, 2026
DavidHuber-NOAA and others added 5 commits February 27, 2026 13:52
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
…-emc/global-workflow into copilot/perform-parallel-copies-com

Development

Successfully merging this pull request may close these issues.

Perform copies to COM using MPMD for forecast outputs
gdasstage_ic and gdasfcst_seg0 disagree on staged filenames for ocean restarts

5 participants