Gitlab documentation #4576
Pull request overview
This PR adds comprehensive documentation for the GitLab CI/CD pipeline infrastructure used by the global-workflow project. The documentation explains the repository mirroring strategy between GitHub and GitLab, pipeline architecture, runner deployment on RDHPCS systems, and maintenance procedures. It includes a detailed SVG architecture diagram that visually illustrates the CI/CD flow from GitHub through GitLab to the HPC runners.
Changes:
- Added comprehensive GitLab CI/CD pipeline documentation covering all aspects of the CI infrastructure
- Created an SVG architecture diagram showing repository mirroring and pipeline execution flow
- Updated documentation index to include the new CI/CD pipeline documentation
Reviewed changes
Copilot reviewed 2 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| docs/source/index.rst | Added ci_cd_pipeline.rst to the documentation table of contents |
| docs/source/ci_cd_pipeline.rst | Comprehensive 877-line documentation covering repository mirroring, pipeline architecture, runner setup, execution details, and troubleshooting |
| docs/source/_static/ci_cd_architecture.svg | Professional SVG diagram (229 lines) illustrating the complete CI/CD architecture with GitHub, GitLab instances, pipeline stages, and RDHPCS runners |
    Key Design Principles
    =====================

    - **GitHub is authoritative**: All development happens on GitHub
      (``https://github.com/NOAA-EMC/global-workflow``). GitLab is used solely as
      a CI execution platform.
    - **Two-tier mirroring**: A licensed GitLab instance performs the pull mirror from
      GitHub, and subsequently push mirrors to the NOAA community GitLab instance.
    - **HPC-native testing**: Runners execute directly on the target HPC nodes,
      ensuring tests build and run against the real Spack-Stack software environment.
    - **Multi-modal pipelines**: The system supports both comprehensive end-to-end
      experiment cases and fast CTest-based functional checks.
    - **GitHub feedback loop**: Pipeline results flow back to GitHub through PR labels,
      PR comments (including error log gists), and status badges.
There's a lot of extraneous information in all of this documentation. Let's aim to keep this to what's important and not be too redundant.
Suggest removing this.
Ok, will refine.
    Pull Mirroring (Licensed GitLab Instance)
    ==========================================

    The first stage uses **pull mirroring**, a feature that is only available on
    licensed (paid) tiers of GitLab (Premium or Ultimate). A single licensed GitLab
    instance is configured to pull from the authoritative GitHub repository:

    .. list-table:: Pull Mirror Configuration
       :widths: 25 75
       :header-rows: 1

       * - Setting
         - Value
       * - **Source repository**
         - ``https://github.com/NOAA-EMC/global-workflow.git``
       * - **Direction**
         - Pull
       * - **Scope**
         - All branches
       * - **Sync frequency**
         - Automatic (every few minutes)

    The licensed instance's sole purpose is **mirroring**; it does not run any
    CI/CD pipelines itself. Its pull mirror keeps the GitLab copy synchronized with
    GitHub, and its push mirror (described below) propagates changes onward.

    .. note::

       Pull mirroring is an **advanced feature** available only on licensed instances
       of GitLab (Premium tier and above). It is not available on GitLab Community
       Edition (CE) or the free tier. This is why a separate licensed instance is
       required for the first stage of the mirror chain.

    Push Mirroring (Community GitLab at VLab)
    =========================================

    The second stage uses **push mirroring** from the licensed GitLab instance to
    the NOAA community GitLab instance hosted at VLab:

    .. list-table:: Push Mirror Configuration
       :widths: 25 75
       :header-rows: 1

       * - Setting
         - Value
       * - **Target repository**
         - ``https://vlab.noaa.gov/gitlab-community/NWS/Operations/NCEP/EMC/global-workflow.git``
       * - **Direction**
         - Push
       * - **Scope**
         - All branches
       * - **Sync frequency**
         - Automatic (every few minutes)

    The VLab community GitLab instance is where the **CI/CD pipelines actually
    execute**. GitLab runners deployed on RDHPCS systems register against this
    instance, and all pipeline stages (build, setup, test, finalize) run here.
    This instance also provides the broader NOAA user community with read access
    to the repository.

    Mirror Chain Summary
    ====================

    The complete mirror chain is::

       GitHub (authoritative)
         │
         │  Pull Mirror (licensed GitLab feature)
         ▼
       Licensed GitLab Instance (mirroring only)
         │
         │  Push Mirror (available on all GitLab tiers)
         ▼
       VLab Community GitLab (CI/CD pipelines execute here, NOAA-wide access)

    Both mirrored repositories track **all branches**, ensuring that any branch pushed
    to GitHub (including PR branches fetched during pipeline execution) is available
    for CI testing.

    .. important::

       Developers should **never push directly** to either GitLab instance. All code
       changes must flow through GitHub. The GitLab mirrors are read-only copies
       maintained by the mirroring configuration.
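As a rough illustration of how the mirror chain could be spot-checked, the sketch below parses text in the format printed by `git ls-remote --heads` for two repositories and lists the branches whose tip commits differ. The function names and the sample commit IDs are hypothetical and are not part of this PR; a real check would feed in the actual command output for the GitHub and VLab URLs.

```python
def parse_ls_remote(output: str) -> dict[str, str]:
    """Map branch name -> commit SHA from `git ls-remote --heads` text output."""
    refs = {}
    for line in output.strip().splitlines():
        sha, ref = line.split("\t", 1)
        refs[ref.removeprefix("refs/heads/")] = sha
    return refs


def out_of_sync(source: dict[str, str], mirror: dict[str, str]) -> list[str]:
    """Return branches missing from the mirror or pointing at a different commit."""
    return sorted(b for b, sha in source.items() if mirror.get(b) != sha)


# Hypothetical sample data (not real commit SHAs).
github = parse_ls_remote("aaa111\trefs/heads/develop\nbbb222\trefs/heads/feature/x\n")
vlab = parse_ls_remote("aaa111\trefs/heads/develop\nccc333\trefs/heads/feature/x\n")
print(out_of_sync(github, vlab))
```

An empty list would mean every branch on the source has the same tip commit on the mirror.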
I think this can all just be summarized. No need for the details here. Just
- mirror from GitHub to VLab GitLab
- mirror from VLab GitLab to public GitLab
is enough. Having the URLs for all of these would be useful, but anything beyond that is extraneous.
    1. ``cmake -S "${GW_HOMEgfs}"``: Configure the CTest build
    2. ``ctest -N``: List available tests
    3. ``ctest -L "${CTEST_NAME}"``: Run tests matching a specific label
    4. JUnit XML results are published as GitLab artifacts
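In plainer terms, the four steps are: configure, list the tests, run the ones with a given label, and save a report. The sketch below only assembles those command lines so the sequence is easy to read; it is not the pipeline's actual script. The `-B` build directory, `--test-dir` (CMake 3.20+), and `--output-junit` (CMake 3.21+) options, along with all paths and the label, are assumptions added for illustration.

```python
import shlex


def ctest_commands(source_dir: str, build_dir: str, label: str, junit_xml: str) -> list[list[str]]:
    """Build the configure / list / run command lines as argument lists."""
    return [
        # 1. Configure the CTest build (the -B build directory is an assumption).
        ["cmake", "-S", source_dir, "-B", build_dir],
        # 2. List the available tests without running them.
        ["ctest", "--test-dir", build_dir, "-N"],
        # 3. Run tests whose label matches; 4. write a JUnit XML report
        #    that a CI job could publish as an artifact.
        ["ctest", "--test-dir", build_dir, "-L", label, "--output-junit", junit_xml],
    ]


# Hypothetical paths and label, for illustration only.
for cmd in ctest_commands("/path/to/global-workflow", "build", "my_label", "junit.xml"):
    print(shlex.join(cmd))
```

To actually execute the sequence, each argument list could be passed to `subprocess.run(cmd, check=True)`.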
This is way too jargony. We don't need to know the internals. Please put this into layman's terms.
    .. list-table:: Test Cases by Platform
       :widths: 15 85
       :header-rows: 1

       * - Platform
         - Test Cases
       * - **Hera**
         - C48_ATM, C48_S2SW, C48_S2SWA_gefs, C48mx500_3DVarAOWCDA,
           C48mx500_hybAOWCDA, C96C48_hybatmDA, C96C48_hybatmsnowDA,
           C96C48_hybatmsoilDA, C96C48_ufsgsi_hybatmDA, C96C48_ufs_hybatmDA,
           C96C48mx500_S2SW_cyc_gfs, C96_atm3DVar, C96_gcafs_cycled,
           C96_gcafs_cycled_noDA, C96mx100_S2S, C48_gsienkf_atmDA,
           C48_ufsenkf_atmDA
       * - **Gaea C6**
         - C48_ATM, C48_S2SW, C48_S2SWA_gefs, C48mx500_3DVarAOWCDA,
           C48mx500_hybAOWCDA, C96C48_hybatmDA, C96C48_hybatmsnowDA,
           C96C48_hybatmsoilDA, C96C48mx500_S2SW_cyc_gfs, C96_atm3DVar,
           C96_gcafs_cycled, C96_gcafs_cycled_noDA, C96mx100_S2S,
           C48_gsienkf_atmDA, C48_ufsenkf_atmDA
       * - **Orion**
         - C48_ATM, C48_S2SW, C48_S2SWA_gefs, C96C48_hybatmDA,
           C96C48mx500_S2SW_cyc_gfs, C96_atm3DVar, C96mx100_S2S,
           C96_gcafs_cycled
       * - **Hercules**
         - C48_ATM, C48_S2SW, C48_S2SWA_gefs, C48mx500_3DVarAOWCDA,
           C48mx500_hybAOWCDA, C96C48_hybatmDA, C96C48mx500_S2SW_cyc_gfs,
           C96_atm3DVar, C96mx100_S2S, C96_gcafs_cycled
       * - **Ursa**
         - C48_ATM, C48_S2SW, C48_S2SWA_gefs, C48mx500_3DVarAOWCDA,
           C48mx500_hybAOWCDA, C96C48_hybatmDA, C96C48_hybatmsnowDA,
           C96C48_hybatmsoilDA, C96C48_ufsgsi_hybatmDA, C96C48_ufs_hybatmDA,
           C96C48mx500_S2SW_cyc_gfs, C96_atm3DVar, C96mx100_S2S,
           C96_gcafs_cycled, C96_gcafs_cycled_noDA, C48_gsienkf_atmDA,
           C48_ufsenkf_atmDA
This doesn't belong here. This is too specific and would be a bear to keep updated. Only the paths to where this is controlled are needed.
    Pipeline Variables
    ==================

    The following variables control pipeline behavior and can be set from
    GitLab scheduled pipelines, GitHub Actions triggers, or the GitLab web UI:

    .. list-table:: Key Pipeline Variables
       :widths: 25 15 60
       :header-rows: 1

       * - Variable
         - Default
         - Description
       * - ``PIPELINE_TYPE``
         - ``pr_cases``
         - Testing modality: ``pr_cases`` or ``ctests``
       * - ``GFS_CI_RUN_TYPE``
         - ``pr_cases``
         - Run classification: ``pr_cases`` or ``nightly``
       * - ``RUN_ON_MACHINES``
         - ``all``
         - Space-separated list of machines or ``all``
       * - ``PR_NUMBER``
         - ``0``
         - GitHub PR number (``0`` = develop branch)
       * - ``GITHUB_COMMIT_SHA``
         - (empty)
         - PR head commit SHA for GitLab native GitHub integration
       * - ``GW_REPO_URL``
         - ``https://github.com/NOAA-EMC/global-workflow.git``
         - Authoritative GitHub repository URL
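To make the defaults in this table concrete, here is a minimal sketch of how such variables might be resolved: values supplied by the caller override the defaults, and the two enumerated variables are checked against their documented choices. The `resolve_variables` helper is hypothetical and does not reflect how `.gitlab-ci.yml` itself evaluates variables.

```python
import os

# Defaults copied from the table above.
DEFAULTS = {
    "PIPELINE_TYPE": "pr_cases",
    "GFS_CI_RUN_TYPE": "pr_cases",
    "RUN_ON_MACHINES": "all",
    "PR_NUMBER": "0",
    "GITHUB_COMMIT_SHA": "",
    "GW_REPO_URL": "https://github.com/NOAA-EMC/global-workflow.git",
}

# Documented choices for the enumerated variables.
ALLOWED = {
    "PIPELINE_TYPE": {"pr_cases", "ctests"},
    "GFS_CI_RUN_TYPE": {"pr_cases", "nightly"},
}


def resolve_variables(env=None) -> dict:
    """Overlay supplied values on the defaults and validate enumerated ones."""
    env = os.environ if env is None else env
    resolved = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    for name, allowed in ALLOWED.items():
        if resolved[name] not in allowed:
            raise ValueError(f"{name}={resolved[name]!r} not in {sorted(allowed)}")
    return resolved


# Hypothetical trigger: a CTest run for PR 4576 on all machines.
settings = resolve_variables({"PIPELINE_TYPE": "ctests", "PR_NUMBER": "4576"})
print(settings["PIPELINE_TYPE"], settings["PR_NUMBER"], settings["RUN_ON_MACHINES"])
```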
Please move this after the GitHub Actions Integration section.
    The registration command configures the runner with:

    - **Executor**: ``shell`` (runs directly in the HPC environment)
    - **Shell**: ``bash``
    - **Builds directory**: ``${GITLAB_BUILDS_DIR}`` (from platform config)
    - **Custom build directory**: enabled (allowing ``.gitlab-ci.yml`` to override
      the clone path via ``GIT_CLONE_PATH``)
    - **Concurrency**: 24 concurrent requests

    After registration, the script updates the runner's ``config.toml`` to set
    ``concurrent = 24``.
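For context, the global `concurrent` setting in GitLab Runner's `config.toml` caps how many jobs can run at once across all runners defined in that file. The sketch below shows one way such an update could be made on the file's text. It is not the logic of the launch script, and the sample configuration and runner name are hypothetical.

```python
import re


def set_concurrent(config_text: str, value: int) -> str:
    """Set the top-level `concurrent` key, adding it at the top if absent."""
    line = f"concurrent = {value}"
    # Only match an unindented key, which is where the global setting lives.
    pattern = re.compile(r"^concurrent\s*=.*$", flags=re.MULTILINE)
    if pattern.search(config_text):
        return pattern.sub(line, config_text, count=1)
    # TOML requires top-level keys to appear before any table header.
    return line + "\n" + config_text


# Hypothetical config.toml contents.
sample = (
    "concurrent = 1\n"
    "check_interval = 0\n"
    "\n"
    "[[runners]]\n"
    '  name = "hypothetical-runner"\n'
    '  executor = "shell"\n'
)
print(set_concurrent(sample, 24))
```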
Why is this important? If it isn't, please remove. If it is important, can you please further describe what concurrency is?
    └── Jenkins/ # Legacy Jenkins directories
        ├── agent/
        └── workspace/
Not necessary.
    Runner Maintenance
    ==================
Important to note that this needs to be done on the same head node that it was initially launched on.
    development.rst
    testing.rst
    ci_cd_pipeline.rst
Please move this to the end.
Add Comprehensive GitLab CI/CD Pipeline Documentation
Summary
Adds a new ReadTheDocs page documenting the full GitLab CI/CD pipeline infrastructure for global-workflow. This fills a significant documentation gap — until now, the mirroring setup, pipeline architecture, runner deployment, and maintenance procedures were only known through tribal knowledge and inline code comments.
Motivation
The CI/CD infrastructure spans multiple GitLab instances, five RDHPCS platforms, and a GitHub Actions bridge, with configuration spread across ~15 files. New developers and maintainers had no single reference to understand how the pieces fit together. This documentation provides that reference.
What's Included
New Files
- docs/source/ci_cd_pipeline.rst
- docs/source/_static/ci_cd_architecture.svg
Modified Files
- docs/source/index.rst: added `ci_cd_pipeline` to the toctree (after `testing`)
Documentation Sections
The RST document covers the following topics in depth:
- `trigger-gitlab-pipelines.yml` bridges GitHub PRs to GitLab pipelines, including secrets, label lifecycle (`CI-Ready-to-Run` → `CI-{host}-Running` → `CI-{host}-{Pass/Fail}`), and manual trigger instructions
- `develop` branch testing, badge generation
- `launch_gitlab_runner.sh` usage (register/run/unregister), platform-specific configurations for all 5 hosts, directory layout, maintenance procedures
- `ci_utils.sh` functions, `run_check_gitlab_ci.sh` experiment monitoring, timeout handling
Architecture Diagram
The SVG diagram visually shows:
Important architectural note: The Licensed GitLab instance is used only for mirroring (pull from GitHub, push to VLab). The VLab Community GitLab instance is where the CI/CD pipelines actually execute and where runners are registered.
ReadTheDocs Preview
https://global-workflow--4576.org.readthedocs.build/en/4576/ci_cd_pipeline.html
Notes for Reviewers
- `gitlab-ci-hosts.yml`: confirm these are current