Gitlab documentation #4576

Open
TerrenceMcGuinness-NOAA wants to merge 7 commits into NOAA-EMC:develop from TerrenceMcGuinness-NOAA:gitlab_documentation
Collaborator

TerrenceMcGuinness-NOAA commented Feb 20, 2026

Add Comprehensive GitLab CI/CD Pipeline Documentation

Summary

Adds a new ReadTheDocs page documenting the full GitLab CI/CD pipeline infrastructure for global-workflow. This fills a significant documentation gap — until now, the mirroring setup, pipeline architecture, runner deployment, and maintenance procedures were only known through tribal knowledge and inline code comments.

Motivation

The CI/CD infrastructure spans multiple GitLab instances, five RDHPCS platforms, and a GitHub Actions bridge, with configuration spread across ~15 files. New developers and maintainers had no single reference to understand how the pieces fit together. This documentation provides that reference.

What's Included

New Files

File | Description
docs/source/ci_cd_pipeline.rst | ~900-line comprehensive RST document
docs/source/_static/ci_cd_architecture.svg | Architecture diagram (SVG)

Modified Files

File | Change
docs/source/index.rst | Added ci_cd_pipeline to the toctree (after testing)

Documentation Sections

The RST document covers the following topics in depth:

  1. Overview — Why GitLab CI/CD is used alongside GitHub; key design principles
  2. Repository Mirroring — Two-stage mirror chain:
    • Pull mirror (Licensed GitLab Premium) — pulls from GitHub (advanced feature requiring paid tier)
    • Push mirror (Licensed → VLab Community) — propagates to the community instance where CI/CD actually runs
    • Mirror chain summary with ASCII diagram
  3. Pipeline Architecture — Four stages (Build → Setup → Run → Finalize), two modalities (PR Cases via Rocoto, CTests via CMake), per-host test matrices for all 5 platforms, pipeline variables reference
  4. GitHub Actions Integration: How trigger-gitlab-pipelines.yml bridges GitHub PRs to GitLab pipelines, including secrets, label lifecycle (CI-Ready-to-Run → CI-{host}-Running → CI-{host}-{Pass/Fail}), and manual trigger instructions
  5. Nightly Operations — Scheduled pipelines, develop branch testing, badge generation
  6. GitLab Runner Setup: launch_gitlab_runner.sh usage (register/run/unregister), platform-specific configurations for all 5 hosts, directory layout, maintenance procedures
  7. Pipeline Execution Details: ci_utils.sh functions, run_check_gitlab_ci.sh experiment monitoring, timeout handling
  8. Adding New Hosts — Step-by-step guide for onboarding a new RDHPCS platform
  9. File Reference — Complete table of all CI/CD-related files with locations and purposes
  10. Troubleshooting — Common failure scenarios and resolutions
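
The label lifecycle named in item 4 can be read as a small state machine. The following is a minimal Python sketch of that reading, not code from the repository: the `Stage` enum and the `label_for` and `lifecycle` helpers are hypothetical, and the label spellings are assumed from the lifecycle text above, so they should be checked against trigger-gitlab-pipelines.yml.

```python
from enum import Enum
from typing import List, Optional


class Stage(Enum):
    """Hypothetical names for the phases of the label lifecycle."""
    READY = "ready"
    RUNNING = "running"
    PASSED = "passed"
    FAILED = "failed"


def label_for(stage: Stage, host: Optional[str] = None) -> str:
    """Build the PR label for a stage; label spellings are assumptions."""
    if stage is Stage.READY:
        # The ready label carries no host name in the lifecycle text.
        return "CI-Ready-to-Run"
    if not host:
        raise ValueError("a host name is required once a pipeline has started")
    suffix = {
        Stage.RUNNING: "Running",
        Stage.PASSED: "Pass",
        Stage.FAILED: "Fail",
    }[stage]
    return f"CI-{host}-{suffix}"


def lifecycle(host: str, succeeded: bool) -> List[str]:
    """Return the labels a PR would carry, in order, for one host."""
    final = Stage.PASSED if succeeded else Stage.FAILED
    return [
        label_for(Stage.READY),
        label_for(Stage.RUNNING, host),
        label_for(final, host),
    ]


if __name__ == "__main__":
    print(" -> ".join(lifecycle("Hera", succeeded=True)))
```

Keeping the label construction in one function means a renamed label only has to change in one place.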

Architecture Diagram

The SVG diagram visually shows:

  • The three-repository mirror chain (GitHub → Licensed GitLab → VLab Community GitLab)
  • GitHub Actions API trigger path (GitHub → VLab Community)
  • Feedback loop (labels, comments, badges back to GitHub)
  • Four pipeline stages with descriptions
  • All five RDHPCS runner platforms with test case counts and tags
  • Key configuration files

Important architectural note: The Licensed GitLab instance is used only for mirroring (pull from GitHub, push to VLab). The VLab Community GitLab instance is where the CI/CD pipelines actually execute and where runners are registered.

ReadTheDocs Preview

https://global-workflow--4576.org.readthedocs.build/en/4576/ci_cd_pipeline.html

Notes for Reviewers

  • The document references actual file paths, variable names, and script functions from the codebase — please verify these are still accurate
  • Test case counts per host (Hera: 17, Gaea C6: 15, Orion: 8, Hercules: 10, Ursa: 17) were sourced from gitlab-ci-hosts.yml — confirm these are current
  • The mirroring description reflects that pull mirroring requires GitLab Premium and the community VLab instance runs the actual pipelines/runners
  • No functional code changes — documentation only

Contributor

Copilot AI left a comment

Pull request overview

This PR adds comprehensive documentation for the GitLab CI/CD pipeline infrastructure used by the global-workflow project. The documentation explains the repository mirroring strategy between GitHub and GitLab, pipeline architecture, runner deployment on RDHPCS systems, and maintenance procedures. It includes a detailed SVG architecture diagram that visually illustrates the CI/CD flow from GitHub through GitLab to the HPC runners.

Changes:

  • Added comprehensive GitLab CI/CD pipeline documentation covering all aspects of the CI infrastructure
  • Created an SVG architecture diagram showing repository mirroring and pipeline execution flow
  • Updated documentation index to include the new CI/CD pipeline documentation

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated no comments.

File | Description
docs/source/index.rst | Added ci_cd_pipeline.rst to the documentation table of contents
docs/source/ci_cd_pipeline.rst | Comprehensive 877-line documentation covering repository mirroring, pipeline architecture, runner setup, execution details, and troubleshooting
docs/source/_static/ci_cd_architecture.svg | Professional SVG diagram (229 lines) illustrating the complete CI/CD architecture with GitHub, GitLab instances, pipeline stages, and RDHPCS runners

Comment on lines +63 to +76
Key Design Principles
=====================

- **GitHub is authoritative**: All development happens on GitHub
  (``https://github.com/NOAA-EMC/global-workflow``). GitLab is used solely as
  a CI execution platform.
- **Two-tier mirroring**: A licensed GitLab instance performs the pull mirror from
  GitHub, and subsequently push mirrors to the NOAA community GitLab instance.
- **HPC-native testing**: Runners execute directly on the target HPC nodes,
  ensuring tests build and run against the real Spack-Stack software environment.
- **Multi-modal pipelines**: The system supports both comprehensive end-to-end
  experiment cases and fast CTest-based functional checks.
- **GitHub feedback loop**: Pipeline results flow back to GitHub through PR labels,
  PR comments (including error log gists), and status badges.
Contributor

There's a lot of extraneous information in all of this documentation. Let's aim to keep this to what's important and not be too redundant.

Suggest removing this.

Suggested change: delete the quoted "Key Design Principles" section (lines +63 to +76), with no replacement text.

Collaborator Author

Ok, will refine.

Comment on lines +86 to +169
Pull Mirroring (Licensed GitLab Instance)
==========================================

The first stage uses **pull mirroring**, a feature that is only available on
licensed (paid) tiers of GitLab (Premium or Ultimate). A single licensed GitLab
instance is configured to pull from the authoritative GitHub repository:

.. list-table:: Pull Mirror Configuration
   :widths: 25 75
   :header-rows: 1

   * - Setting
     - Value
   * - **Source repository**
     - ``https://github.com/NOAA-EMC/global-workflow.git``
   * - **Direction**
     - Pull
   * - **Scope**
     - All branches
   * - **Sync frequency**
     - Automatic (every few minutes)

The licensed instance's sole purpose is **mirroring** — it does not run any
CI/CD pipelines itself. Its pull mirror keeps the GitLab copy synchronized with
GitHub, and its push mirror (described below) propagates changes onward.

.. note::

   Pull mirroring is an **advanced feature** available only on licensed instances
   of GitLab (Premium tier and above). It is not available on GitLab Community
   Edition (CE) or the free tier. This is why a separate licensed instance is
   required for the first stage of the mirror chain.

Push Mirroring (Community GitLab at VLab)
=========================================

The second stage uses **push mirroring** from the licensed GitLab instance to
the NOAA community GitLab instance hosted at VLab:

.. list-table:: Push Mirror Configuration
   :widths: 25 75
   :header-rows: 1

   * - Setting
     - Value
   * - **Target repository**
     - ``https://vlab.noaa.gov/gitlab-community/NWS/Operations/NCEP/EMC/global-workflow.git``
   * - **Direction**
     - Push
   * - **Scope**
     - All branches
   * - **Sync frequency**
     - Automatic (every few minutes)

The VLab community GitLab instance is where the **CI/CD pipelines actually
execute**. GitLab runners deployed on RDHPCS systems register against this
instance, and all pipeline stages (build, setup, test, finalize) run here.
This instance also provides the broader NOAA user community with read access
to the repository.

Mirror Chain Summary
====================

The complete mirror chain is::

    GitHub (authoritative)
        │ Pull Mirror (licensed GitLab feature)
    Licensed GitLab Instance (mirroring only)
        │ Push Mirror (available on all GitLab tiers)
    VLab Community GitLab (CI/CD pipelines execute here, NOAA-wide access)

Both mirrored repositories track **all branches**, ensuring that any branch pushed
to GitHub (including PR branches fetched during pipeline execution) is available
for CI testing.

.. important::

   Developers should **never push directly** to either GitLab instance. All code
   changes must flow through GitHub. The GitLab mirrors are read-only copies
   maintained by the mirroring configuration.
Contributor

I think this all can just be summarized. No need for the details here. Just

  • mirror from github to VLab gitlab
  • mirror from VLab gitlab to public gitlab

is enough. Having the URLs for all of these would be useful, but beyond that is extraneous.

Comment on lines +264 to +267
1. ``cmake -S "${GW_HOMEgfs}"`` — Configure the CTest build
2. ``ctest -N`` — List available tests
3. ``ctest -L "${CTEST_NAME}"`` — Run tests matching a specific label
4. JUnit XML results are published as GitLab artifacts
Contributor

This is way too jargony. We don't need to know the internals. Please put this into layman's terms.
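
In plain terms, the four quoted steps prepare the tests, list them, run only the ones carrying a given label, and save the results in a form GitLab can display. A minimal Python sketch of that sequence follows; it only assembles the commands and does not run them. The function name, the separate build directory, the `-B` and `--test-dir` options, and the `results.xml` file name are assumptions for illustration and are not taken from the PR.

```python
import shlex
from typing import List


def ctest_commands(home_gfs: str, build_dir: str, label: str) -> List[List[str]]:
    """Assemble the configure, list, and run commands as argument lists.

    Only ``cmake -S``, ``ctest -N`` and ``ctest -L`` come from the quoted
    documentation; every other option here is an illustrative assumption.
    """
    return [
        # Step 1: prepare (configure) the tests from the workflow checkout.
        ["cmake", "-S", home_gfs, "-B", build_dir],
        # Step 2: list the available tests without running them.
        ["ctest", "--test-dir", build_dir, "-N"],
        # Steps 3 and 4: run the tests with the given label and write a
        # JUnit XML report that a CI system can pick up as an artifact.
        ["ctest", "--test-dir", build_dir, "-L", label,
         "--output-junit", "results.xml"],
    ]


if __name__ == "__main__":
    # Hypothetical paths and label, shown only to illustrate the output.
    for command in ctest_commands("/path/to/global-workflow", "/tmp/ctest-build", "example_label"):
        print(shlex.join(command))
```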

Comment on lines +279 to +312
.. list-table:: Test Cases by Platform
   :widths: 15 85
   :header-rows: 1

   * - Platform
     - Test Cases
   * - **Hera**
     - C48_ATM, C48_S2SW, C48_S2SWA_gefs, C48mx500_3DVarAOWCDA,
       C48mx500_hybAOWCDA, C96C48_hybatmDA, C96C48_hybatmsnowDA,
       C96C48_hybatmsoilDA, C96C48_ufsgsi_hybatmDA, C96C48_ufs_hybatmDA,
       C96C48mx500_S2SW_cyc_gfs, C96_atm3DVar, C96_gcafs_cycled,
       C96_gcafs_cycled_noDA, C96mx100_S2S, C48_gsienkf_atmDA,
       C48_ufsenkf_atmDA
   * - **Gaea C6**
     - C48_ATM, C48_S2SW, C48_S2SWA_gefs, C48mx500_3DVarAOWCDA,
       C48mx500_hybAOWCDA, C96C48_hybatmDA, C96C48_hybatmsnowDA,
       C96C48_hybatmsoilDA, C96C48mx500_S2SW_cyc_gfs, C96_atm3DVar,
       C96_gcafs_cycled, C96_gcafs_cycled_noDA, C96mx100_S2S,
       C48_gsienkf_atmDA, C48_ufsenkf_atmDA
   * - **Orion**
     - C48_ATM, C48_S2SW, C48_S2SWA_gefs, C96C48_hybatmDA,
       C96C48mx500_S2SW_cyc_gfs, C96_atm3DVar, C96mx100_S2S,
       C96_gcafs_cycled
   * - **Hercules**
     - C48_ATM, C48_S2SW, C48_S2SWA_gefs, C48mx500_3DVarAOWCDA,
       C48mx500_hybAOWCDA, C96C48_hybatmDA, C96C48mx500_S2SW_cyc_gfs,
       C96_atm3DVar, C96mx100_S2S, C96_gcafs_cycled
   * - **Ursa**
     - C48_ATM, C48_S2SW, C48_S2SWA_gefs, C48mx500_3DVarAOWCDA,
       C48mx500_hybAOWCDA, C96C48_hybatmDA, C96C48_hybatmsnowDA,
       C96C48_hybatmsoilDA, C96C48_ufsgsi_hybatmDA, C96C48_ufs_hybatmDA,
       C96C48mx500_S2SW_cyc_gfs, C96_atm3DVar, C96mx100_S2S,
       C96_gcafs_cycled, C96_gcafs_cycled_noDA, C48_gsienkf_atmDA,
       C48_ufsenkf_atmDA
Contributor

This doesn't belong here. This is too specific and would be a bear trying to keep updated. Just paths to where this is controlled is needed.

Comment on lines +314 to +344
Pipeline Variables
==================

The following variables control pipeline behavior and can be set from
GitLab scheduled pipelines, GitHub Actions triggers, or the GitLab web UI:

.. list-table:: Key Pipeline Variables
   :widths: 25 15 60
   :header-rows: 1

   * - Variable
     - Default
     - Description
   * - ``PIPELINE_TYPE``
     - ``pr_cases``
     - Testing modality: ``pr_cases`` or ``ctests``
   * - ``GFS_CI_RUN_TYPE``
     - ``pr_cases``
     - Run classification: ``pr_cases`` or ``nightly``
   * - ``RUN_ON_MACHINES``
     - ``all``
     - Space-separated list of machines or ``all``
   * - ``PR_NUMBER``
     - ``0``
     - GitHub PR number (``0`` = develop branch)
   * - ``GITHUB_COMMIT_SHA``
     - (empty)
     - PR head commit SHA for GitLab native GitHub integration
   * - ``GW_REPO_URL``
     - ``https://github.com/NOAA-EMC/global-workflow.git``
     - Authoritative GitHub repository URL
Contributor

Please move this after the GitHub Actions Integration section.
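
Wherever the table ends up, its defaults and allowed values can be summarized compactly. The following Python sketch is one hedged reading of the quoted table: the variable names, defaults, and allowed values come from the table, while `resolve_variables`, `selected_machines`, and the lowercase host identifiers in `KNOWN_HOSTS` are hypothetical and do not describe how the pipeline itself parses these variables.

```python
from typing import Dict, List, Optional

# Defaults as listed in the quoted "Key Pipeline Variables" table.
DEFAULTS: Dict[str, str] = {
    "PIPELINE_TYPE": "pr_cases",
    "GFS_CI_RUN_TYPE": "pr_cases",
    "RUN_ON_MACHINES": "all",
    "PR_NUMBER": "0",
    "GITHUB_COMMIT_SHA": "",
    "GW_REPO_URL": "https://github.com/NOAA-EMC/global-workflow.git",
}

# Allowed values for the two enumerated variables, per the table.
ALLOWED = {
    "PIPELINE_TYPE": ("pr_cases", "ctests"),
    "GFS_CI_RUN_TYPE": ("pr_cases", "nightly"),
}

# Hypothetical identifiers for the five platforms named in the PR.
KNOWN_HOSTS = ["hera", "gaeac6", "orion", "hercules", "ursa"]


def resolve_variables(overrides: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Merge caller overrides onto the defaults and validate enumerations."""
    resolved = dict(DEFAULTS)
    for name, value in (overrides or {}).items():
        if name not in DEFAULTS:
            raise KeyError(f"unknown pipeline variable: {name}")
        resolved[name] = str(value)
    for name, choices in ALLOWED.items():
        if resolved[name] not in choices:
            raise ValueError(f"{name} must be one of {choices}")
    return resolved


def selected_machines(resolved: Dict[str, str], known_hosts: List[str]) -> List[str]:
    """Expand RUN_ON_MACHINES: 'all' means every known host."""
    value = resolved["RUN_ON_MACHINES"].strip()
    if value == "all":
        return list(known_hosts)
    return value.split()


if __name__ == "__main__":
    nightly = resolve_variables({"GFS_CI_RUN_TYPE": "nightly"})
    print(selected_machines(nightly, KNOWN_HOSTS))
```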

Comment on lines +603 to +613
The registration command configures the runner with:

- **Executor**: ``shell`` (runs directly in the HPC environment)
- **Shell**: ``bash``
- **Builds directory**: ``${GITLAB_BUILDS_DIR}`` (from platform config)
- **Custom build directory**: enabled (allowing ``.gitlab-ci.yml`` to override
  the clone path via ``GIT_CLONE_PATH``)
- **Concurrency**: 24 concurrent requests

After registration, the script updates the runner's ``config.toml`` to set
``concurrent = 24``.
Contributor

Why is this important? If it isn't, please remove. If it is important, can you please further describe what concurrency is?
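
For context on the question: in GitLab Runner, the top-level `concurrent` setting in `config.toml` caps how many jobs may run at the same time across the runners defined in that file. The Python sketch below shows one way such an update could be made on the file's text. It is an assumption-laden illustration, not the logic of `launch_gitlab_runner.sh`: the `set_concurrent` helper and the sample configuration are hypothetical.

```python
import re

# Matches a top-level "concurrent = <number>" line; spaces and tabs only,
# so neighbouring blank lines are left untouched.
_CONCURRENT_LINE = re.compile(r"^concurrent[ \t]*=[ \t]*\d+[ \t]*$", re.MULTILINE)


def set_concurrent(config_text: str, value: int) -> str:
    """Return config.toml text with the global `concurrent` set to `value`."""
    new_line = f"concurrent = {value}"
    if _CONCURRENT_LINE.search(config_text):
        # Replace the existing setting in place.
        return _CONCURRENT_LINE.sub(new_line, config_text, count=1)
    # TOML requires top-level keys to appear before any table header,
    # so a missing setting is added at the very top.
    return new_line + "\n" + config_text


if __name__ == "__main__":
    # Hypothetical sample file contents for illustration only.
    sample = (
        "concurrent = 1\n"
        "check_interval = 0\n"
        "\n"
        "[[runners]]\n"
        '  name = "example-runner"\n'
        '  executor = "shell"\n'
    )
    print(set_concurrent(sample, 24))
```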

Comment on lines +659 to +661
└── Jenkins/              # Legacy Jenkins directories
    ├── agent/
    └── workspace/
Contributor

Not necessary.

Suggested change: delete the three quoted Jenkins lines (+659 to +661), with no replacement text.

Comment on lines +663 to +664
Runner Maintenance
==================
Contributor

Important to note that this needs to be done on the same head node that it was initially launched on.


development.rst
testing.rst
ci_cd_pipeline.rst
Contributor

Please move this to the end.

TerrenceMcGuinness-NOAA added the "CI/CD" label (Issue related to CI/CD) on Mar 2, 2026